Why Data Democratization Needs Natural Language Generation

Visualizations are no longer enough to deal with the volume of data we have today


Natural Language Generation (NLG) has long lived in the shadow of Natural Language Processing (NLP), which has received far more investment and attention. Indeed, the two are often misunderstood in business and wrongly used interchangeably. However, NLG is slowly beginning to make a name for itself in its own right. A number of startups operating in the space, such as Narrative Science and Automated Insights, are now making it big, while established giants like IBM are also driving innovations that are rendering NLG indispensable - particularly for data democratization.

NLP focuses on understanding what ideas are being communicated by analyzing textual data for patterns. NLG, meanwhile, is a branch of AI that communicates the findings and insights discovered by NLP by translating them into natural language. It is integrated into analytics tools to work alongside data visualizations and provide mainstream users with a clearer narrative, and we now see examples of it every day, from text-based summaries of intelligence reports to mutual fund performance commentary.

The amount of data that organizations consume and produce has grown exponentially over the last few years, but while it may be comforting to have a wealth of data in reserve, it is only valuable when you do something with it - analyze it, visualize it, and so on. Big data has, however, now outgrown the basic dashboards and human data scientists that did this work in the past. There is simply too much data, and too many insights waiting within it to be revealed. Machine learning and advanced statistical algorithms are now vital if organizations are to keep surfacing those insights, along with automated data visualizations to represent the findings. NLG is essentially a complement to data visualization that makes the information easier to interpret. For example, in data-intensive industries like finance, workers may have forty different graphs spread across a number of monitors. While these likely contain all the information necessary, it takes considerable human effort and time to unlock the insights they hold. NLG can compare the graphs and provide the worker with advice, explaining the analysis without the need for a highly skilled and expensive data expert, so that the task can be completed far more quickly and cost effectively.
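At their simplest, systems like this are data-driven templates: they pick out the salient figures and slot them into natural-language sentences. The sketch below is a minimal, illustrative example of that idea in Python; the function name, thresholds, and data are hypothetical and not drawn from any commercial NLG product.

```python
# Minimal template-based NLG sketch: turn raw sector-return figures
# into a one-sentence narrative a non-specialist can read at a glance.
# All names and numbers here are illustrative assumptions.

def summarize_performance(fund: str, sector_returns: dict) -> str:
    """Generate a narrative sentence from a {sector: return %} mapping."""
    best = max(sector_returns, key=sector_returns.get)
    worst = min(sector_returns, key=sector_returns.get)
    return (
        f"For {fund}, the {best} sector was the main contributor to "
        f"performance ({sector_returns[best]:+.1f}%), while {worst} "
        f"detracted most ({sector_returns[worst]:+.1f}%)."
    )

report = summarize_performance(
    "Example Growth Fund",
    {"energy": 4.2, "healthcare": 1.1, "utilities": -0.8},
)
print(report)
```

Production systems layer far more sophistication on top - ranking which facts are worth mentioning, varying phrasing, and handling many chart types - but the core move is the same: from numbers to narrative.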

This is particularly important because it helps to create a data-driven culture, in which insights are easily accessed and understood by employees throughout the company, regardless of whether or not they have a background in the field. A recent study by MIT Sloan Management Review and SAS, ‘The Analytics Mandate’, concluded that an ‘analytics culture’ is the driving factor in achieving competitive advantage from data. David Kiron, executive editor of MIT Sloan Management Review, noted: ‘We found that in companies with a strong analytics culture, decision-making norms include the use of analytics, even if the results challenge views held by senior management. This differentiates those companies from others, where often management experience overrides insights from data.’ Data visualization has advanced significantly in the past few years, and engaging, attractive graphs are now commonplace. They are an extremely useful way of displaying information for the layman. However, as information density increases, graphs become harder to understand and cannot make a rich, complex narrative as easy to interpret as language can.

Michael White, an associate professor at Ohio State University, for one, believes that this means NLG is finally on the precipice of entering the mainstream. He argues that, ‘There’s growing awareness that masses of data and visualizations are not really helpful if they can’t be explained and made relevant. I’d say the time has finally become ripe for natural language generation to have commercial success.’ And there are a number of startups operating in the area that seem to be proving him correct. Narrative Science’s writing software, Quill, for example, is already making waves and has been used everywhere from Wall Street to US intelligence agencies. This is an example taken from an investment report, which shows how Quill has managed to produce text that could have been written by a human, albeit a slightly robotic one:

‘The energy sector was the main contributor to relative performance, led by stock selection in energy equipment and services companies. In terms of individual contributors, a position in energy equipment and services company Oceaneering International was the largest contributor to returns. Stock selection also contributed to relative results in the healthcare sector. Positioning in health care equipment and supplies industry helped most.’

By demystifying data and communicating insights in real time, NLG allows decision makers to react quickly to information that in the past would have taken hours or weeks to obtain manually. This provides a clear competitive advantage for an organization using it. NLG-driven real-time insights take over menial tasks, freeing up business owners, team members, and data practitioners to focus on high-level strategic activities and driving growth. Ultimately, data only creates value when people apply it to real situations, and that is impossible if you cannot communicate it in a way that engages them.

To learn more about the power of Natural Language Generation, attend our Data Visualisation Summit in San Francisco this April. Click here to read more.
