Predictive algorithms are making it possible to examine huge amounts of text for useful patterns and trends, a process known as ‘text analytics’. While analyzing the structured information in Big Data is an easy way to discover a wide variety of things about your business, text analytics can surface far more nuanced insights. It can be used to find out how your customers feel about you and your product or service and, often more importantly, why they feel that way.
Text analytics takes swathes of unstructured text - far more than a human being could process - and applies Big Data techniques to it. It has been used for a number of purposes. For example, two PhD students at the Stanford Literary Lab fed the entire content of 2,958 19th-century novels through a series of Big Data analytics tools. They drew a number of insights about the real world at the time, noting that words describing action and body parts became more prevalent as the century went on. The researchers concluded that increasing urbanization during the 19th century brought people closer together physically, which made people’s bodies and actions harder to ignore.
There are also a number of practical business applications for text analytics - customer experience management, brand monitoring, compliance, and business intelligence can all be improved by it. A recent Allied Market Research study found that retailers are the biggest users, accounting for one-third of the market. They can use it to count how often key phrases describing an attribute or aspect of a shop appear in online reviews, or to look at tweets and Facebook posts in order to gauge sentiment on social media.
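To make the review example concrete, here is a minimal Python sketch of counting how many reviews mention each key phrase. The function name `count_key_phrases`, the sample reviews, and the phrase list are all made up for illustration; a real retail system would likely add stemming, synonyms, and far more data.

```python
import re
from collections import Counter

def count_key_phrases(reviews, phrases):
    """Count how many reviews mention each key phrase (case-insensitive)."""
    counts = Counter()
    for review in reviews:
        text = review.lower()
        for phrase in phrases:
            # Word boundaries keep "queue" from matching inside "queued".
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            if re.search(pattern, text):
                counts[phrase] += 1
    return counts

# Invented reviews, purely for illustration.
reviews = [
    "Friendly staff, but the checkout queue was long.",
    "Long queue again. The staff were friendly though.",
    "Great prices and friendly staff.",
]
phrases = ["friendly staff", "queue", "prices"]
counts = count_key_phrases(reviews, phrases)

for phrase in phrases:
    print(phrase, counts[phrase])
```

Note that the second review praises the staff but is not counted under "friendly staff", because exact phrase matching misses the reworded "staff were friendly". That gap is one reason practical tools go beyond simple counting.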
One example of text analytics being used to great effect came during the 2012 US presidential election. Rayid Ghani, former Chief Scientist at Accenture Technology Labs, implemented a massive text analytics program, known as Project Dreamcatcher, while working for Barack Obama’s campaign. Project Dreamcatcher took voters’ own words, whether noted down on clipboards by canvassers at the door or during phone calls, or entered in an online signup sequence or a stunt like ‘share your story.’ These were then analyzed to discover what voters were interested in and why, with keywords and context isolated and statistical patterns gleaned from the examples of millions of voters to discern meaning.
There are still obvious issues with text analytics. Many words have different meanings in different cultures. For example, ‘bad’ would usually be used to express a negative sentiment. However, in some cultures ‘bad’ means good, so text analytics tools may end up reporting a negative sentiment when the writer actually meant something positive. In order for text analytics to function properly, the tools need a high degree of cultural context. For a computer to understand this context, it must understand every human’s personal experiences. This requires a massive concept matrix, which many firms, such as IBM with its Watson computer, are attempting to build.
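The ‘bad’ problem is easy to reproduce with a naive word-list approach. The sketch below is an illustration only: the tiny `LEXICON` and the `lexicon_score` function are hypothetical and far simpler than any commercial tool, but they show how a context-free score goes wrong on slang.

```python
import re

# Hypothetical mini-lexicon; real tools use far larger word lists.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}

def lexicon_score(text):
    """Sum the per-word scores; positive means positive sentiment."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

literal = "The service was bad and the food was awful."
slang = "Man, that guitar solo was so bad!"

print(lexicon_score(literal))  # negative, as intended
print(lexicon_score(slang))    # also negative, though the speaker may mean praise
```

Both sentences receive a negative score because the lexicon has no notion of who is speaking or in what setting, which is exactly the cultural context the scoring would need.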