A Beginner’s Guide To NLP

We look at the basics of Natural Language Processing


Natural Language Processing (NLP) is the branch of Artificial Intelligence (AI) concerned with making computer systems analyze, understand, and generate natural language, i.e. a language spoken by humans, such as English or Spanish. Its best-known application is by those competing, with limited success, in the Turing Test - Alan Turing’s criterion for AI, which holds that a machine can be considered intelligent only if a human being, looking at the replies to questions put to both the machine and a person, is unable to tell which is which.

NLP works in two ways. The first is understanding natural language in order to perform various tasks. The system attempts to translate the meaning and structure of what is said into a machine representation that tells the computer what to do. Siri is an example that will be familiar to most iPhone users: it follows instructions spoken to it and tries to complete them, albeit within limits.
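To make the idea of a "machine representation" concrete, here is a minimal sketch in Python. It assumes a purely rule-based approach with two made-up intents (`set_alarm` and `call_contact`) and hand-written patterns. This is an illustration only, not how Siri or any real assistant is built.

```python
import re

# Hypothetical intents and patterns, invented for this illustration.
# Real assistants rely on learned models rather than a short list of rules.
PATTERNS = [
    ("set_alarm", re.compile(
        r"\b(?:set|create) an? alarm for (?P<time>\d{1,2}(?::\d{2})?\s?(?:am|pm))",
        re.IGNORECASE)),
    ("call_contact", re.compile(r"\bcall (?P<contact>[a-z]+)", re.IGNORECASE)),
]


def parse_command(utterance):
    """Map a spoken-style instruction to a structured intent the computer can act on."""
    for intent, pattern in PATTERNS:
        match = pattern.search(utterance)
        if match:
            # The named groups become the "slots" (details) of the instruction.
            return {"intent": intent, "slots": match.groupdict()}
    # No rule matched: the system is outside its limits.
    return {"intent": "unknown", "slots": {}}


print(parse_command("Please set an alarm for 7:30 am"))
print(parse_command("What is the meaning of life?"))
```

The second call falls through to `unknown`, which is the "albeit within limits" part: anything the rules do not anticipate is simply not understood.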

The second strand of NLP is Natural Language Generation (NLG). NLG takes nonlinguistic input data, processes it, and returns natural language to the user. One example is textual weather forecasts generated from the set of numbers that would be used by meteorologists - temperature, precipitation, wind speed, and so forth.
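The weather example can be sketched with simple templates. The function name, the thresholds, and the wording below are all assumptions made for illustration. Production NLG systems are considerably more sophisticated, but the shape is the same: numbers in, a sentence out.

```python
def generate_forecast(temp_c, rain_chance, wind_kmh):
    """Turn three numbers into a one-sentence forecast using fixed templates.

    All thresholds here are arbitrary, illustrative choices.
    """
    if temp_c < 5:
        temp_word = "cold"
    elif temp_c < 18:
        temp_word = "mild"
    elif temp_c < 27:
        temp_word = "warm"
    else:
        temp_word = "hot"

    if rain_chance >= 0.7:
        rain_phrase = "rain is likely"
    elif rain_chance >= 0.3:
        rain_phrase = "there is a chance of showers"
    else:
        rain_phrase = "it should stay dry"

    if wind_kmh >= 40:
        wind_phrase = "strong winds"
    elif wind_kmh >= 15:
        wind_phrase = "a moderate breeze"
    else:
        wind_phrase = "light winds"

    return (f"Expect a {temp_word} day at around {temp_c:.0f} degrees Celsius "
            f"with {wind_phrase}; {rain_phrase}.")


print(generate_forecast(temp_c=21, rain_chance=0.8, wind_kmh=10))
```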

One of the main tools NLP relies on is machine learning, which it applies to reams of text to draw conclusions about language. This has evolved with computing, although the idea behind NLP predates computers and goes back as far as the 17th century, when a variety of more manual methods were used. The internet has been a clear boon to NLP, as it has made so much more text available to work with.

It is important for NLP to establish the context of each word. Words are ambiguous, and semantics and syntax are often difficult enough for humans to get to grips with, let alone machines. Noam Chomsky’s famous example, ‘colorless green ideas sleep furiously’, illustrates the gap between syntax and semantics. He compared that sentence with ‘furiously sleep ideas green colorless’ and noted that the first, though nonsensical, is grammatical, while the second is not. Context is what helps resolve ambiguity in meaning. For example, a good NLP system will learn that the word ‘hot’, when used in close proximity to the word ‘curry’, usually means ‘spicy’, as opposed to ‘hot’ in the more literal sense of burning. However, there is every chance that it does mean hot as in burning, so the system can only ever judge what is most likely. One of the most common ways people thwart machines during the Turing Test is to ask the same question twice in different ways.
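The ‘hot’ and ‘curry’ idea can be sketched as a lookup over nearby words. The cue-word lists and the window size below are invented for this example. A real system would learn such associations from large amounts of text rather than have them typed in by hand.

```python
import re

# Hypothetical cue words for each sense of "hot"; purely illustrative.
SENSE_CUES = {
    "spicy": {"curry", "chili", "sauce", "pepper", "salsa"},
    "high temperature": {"oven", "stove", "weather", "sun", "water", "burn"},
}


def guess_sense_of_hot(sentence, window=3):
    """Guess which sense of 'hot' is meant from the words within `window` positions of it."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if "hot" not in tokens:
        return None
    position = tokens.index("hot")  # only the first occurrence is considered
    context = set(tokens[max(0, position - window): position + window + 1])
    # Score each sense by how many of its cue words appear near "hot".
    scores = {sense: len(cues & context) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)  # on a tie, the first sense listed wins
    return best if scores[best] > 0 else "ambiguous"


print(guess_sense_of_hot("This curry is really hot"))
print(guess_sense_of_hot("The curry just came out of the oven and is hot"))
print(guess_sense_of_hot("It is hot"))
```

The second sentence shows why context alone is not foolproof. ‘Curry’ is present, but it sits outside the window, and the nearer cue ‘oven’ tips the guess towards temperature.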

As a form of AI, NLP has many applications, and businesses can readily put it to use. It can help greatly in processing large amounts of text. For example, one company applied NLP to Twitter in order to predict where riots were likely to occur next. It can also be used to classify text into categories, to index and search it, and to translate between languages. As machine learning progresses, and the technology becomes more finely attuned to the seeming contrariness of many languages, NLP is set to complete these tasks to an even more impressive degree and to be an even greater resource to business.
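As a rough illustration of classifying text into categories, the sketch below uses a small Naive Bayes classifier, one common machine learning technique for this task. The article does not say which method any particular business uses. The four training sentences and the two labels are made up, and a real system would need far more data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the highest Naive Bayes score for the text."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Start from how common the label is, then add evidence word by word.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data for illustration only.
EXAMPLES = [
    ("the delivery was late and the box was damaged", "complaint"),
    ("my order arrived broken and nobody answered", "complaint"),
    ("great service and fast delivery thank you", "praise"),
    ("love the product excellent quality", "praise"),
]

word_counts, doc_counts = train(EXAMPLES)
print(classify("the box arrived broken", word_counts, doc_counts))
print(classify("excellent quality and fast service", word_counts, doc_counts))
```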

