Natural Language Processing

Inderpal Singh
5 min read · Jun 4, 2021


Natural Language Processing (NLP) is the branch of artificial intelligence that deals with the interaction between humans and computers through natural language. “Natural language” means a language that humans use for everyday communication, such as English, Hindi or Spanish. In contrast to artificial languages such as programming languages, natural languages have evolved as they pass from generation to generation and are hard to pin down with explicit rules.
Technologies based on NLP are becoming increasingly widespread. For example, phones and handheld computers support predictive text and handwriting recognition, and machine translation allows us to retrieve texts written in Chinese and read them in Spanish. By providing more natural human-machine interfaces and more sophisticated access to stored information, language processing has come to play a central role in the multilingual information society.

Applications of Natural Language Processing in daily life:

Natural Language Processing (NLP) is essentially about teaching machines to understand human language, and because it is all about human language, we come across many applications of NLP in daily life without even realizing it. Here are a few examples that almost everyone will have come across:

  1. Chatbots or Conversational Agents:
    From booking your flight tickets to ordering food, chatbots are everywhere today. Rather than waiting for hours to get their queries resolved, customers nowadays want instant answers, and chatbots come in really handy here. Similarly, there are many conversational agents like Alexa, Cortana, Siri and Google Home. They all use NLP internally.
  2. Machine Translation:
    Machine translation uses NLP techniques in combination with Machine Learning/Neural Networks to build systems that are capable of automatic language translation.
  3. Speech Recognition:
    Voice-based personal assistants have become so ubiquitous in the past decade that almost every smartphone user is familiar with them. Siri, Alexa and Google Assistant are a few examples. Whatever a user says to them is natural language, and these systems convert the audio into text using NLP techniques.
  4. Text Summarization:
    In the busy world of today, people need bite-sized summaries of information so that they can act on it without spending more time than necessary. Text summarization is one such application of NLP that is slowly becoming the need of the hour. It is the process of generating a concise, coherent and meaningful summary of text from sources such as books, news articles, blogs, research papers, etc. Applications like Inshorts use this technique.
  5. Recommendation Engine:
    From online shopping giants like Amazon to streaming services like Netflix, recommendation engines are everywhere. Everyone is trying to provide better recommendations to the user to improve their experience. A recommendation engine tries to understand a user’s needs and interests using data on the user’s past behavior and filters it using different algorithms to recommend the most relevant item or choice. Nowadays NLP is used in these engines as well.

Applications of Natural Language Processing in the industry:

  1. Sentiment analysis for customer reviews:
    One of the most common applications of NLP is sentiment analysis, which identifies whether a piece of text expresses a positive, negative or neutral opinion. From opinion polls to creating entire marketing strategies, this domain has reshaped the way businesses work.
  2. Customer support systems:
    Companies like Uber that are large-scale and customer-facing have developed sophisticated systems for customer support requests. With hundreds of thousands of tickets surfacing daily on the platform across 400+ cities worldwide, their team must ensure that agents are empowered to resolve them as accurately and quickly as possible. This is a very interesting use case of NLP in the real world.
  3. Text Mining:
    One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines that can process text data. From social media analytics to risk management and cybercrime protection, dealing with text data has never been more important.
    Text mining is the use of computational methods and techniques to extract high-quality information from text.
    Whereas NLP works with any product of natural human communication, including text, signs and speech, text mining works specifically with text documents. It extracts the documents’ features and uses qualitative analysis.
    The goal is to turn text into data for analysis by applying natural language processing (NLP), different types of algorithms and analytical methods.
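
To make the idea of "turning text into data" a bit more concrete, here is a minimal sketch of a bag-of-words count table built with only the Python standard library. The function names (tokenize, document_term_matrix) and the two sample reviews are made up for this example, and real text mining pipelines usually do much more (stop word removal, weighting, and so on).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, rows); each row holds one document's word counts."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct token seen in any document, sorted.
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical customer reviews, invented for this illustration.
reviews = [
    "The driver was great and the ride was quick.",
    "The ride was late.",
]
vocabulary, rows = document_term_matrix(reviews)

for term, *column in zip(vocabulary, *rows):
    print(f"{term:>8}", column)
```

Each row of the table is now a plain list of numbers, which is the kind of input that later analysis or machine learning steps can work with.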

Tools and Libraries used in Natural Language Processing:

  1. Regular Expressions (regex):
    A regular expression is a sequence of characters that defines a search pattern. In NLP, regular expressions are used for tasks like cleaning or filtering out unnecessary symbols and searching for a given pattern in the text.
  2. Natural Language Toolkit (NLTK):
    NLTK is one of the most popular NLP libraries in Python. It supports a wide range of tasks, from text preprocessing steps such as stop word removal, tokenization, stemming and lemmatization to building n-grams.
  3. spaCy:
    spaCy is often considered a successor to NLTK and is known as an industrial-grade NLP library. It is scalable and uses neural network based models to perform tasks like Named Entity Recognition, Part of Speech tagging, dependency parsing, etc.
  4. Gensim:
    Gensim is an open-source library for unsupervised topic modelling and NLP, using modern statistical machine learning. It is extensively used when working with word embeddings like Word2Vec and Doc2Vec.
  5. fastText:
    fastText is a library created by the Facebook Research team for efficient learning of word representations and sentence classification. This library has gained a lot of traction in the NLP community and is a possible substitute for the Gensim package when working with word vectors.
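
As a small illustration of two of the ideas above, regex-based cleaning and building n-grams, here is a rough sketch that uses only Python's built-in re module. It does not use NLTK or any other library from the list, and the helper names (clean_text, ngrams) and the sample string are invented for this example.

```python
import re


def clean_text(text):
    """Remove URLs and non-letter symbols, collapse whitespace, lowercase."""
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # keep only ASCII letters and whitespace
    return re.sub(r"\s+", " ", text).strip().lower()


def ngrams(tokens, n):
    """Return all n-grams (as tuples) over a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up noisy input, similar to what a social media post might look like.
raw = "Check this out!!! https://example.com NLP is #awesome :)"
cleaned = clean_text(raw)
tokens = cleaned.split()        # naive whitespace tokenization
bigrams = ngrams(tokens, 2)

print(cleaned)
print(bigrams)
```

Note that the second pattern also throws away digits and accented letters, which is fine for this toy input but is an assumption that would need revisiting for other languages or for text where numbers matter.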

Techniques used in Natural Language Processing:

  1. Part of Speech Tagging (POS):
    Part of Speech tagging is the process of marking up each word in a text corpus as corresponding to a particular part of speech (noun, verb, adjective, adverb, etc.), based on both its definition and its context, that is, its relationship with adjacent and related words in a phrase, sentence or paragraph.
  2. Named Entity Recognition (NER):
    Named Entity Recognition is the process of detecting real-world entities such as person names, location names, company names, etc. in a given piece of text.
  3. Topic Modelling:
    Topic Modelling is the process of automatically identifying the topics present in a text corpus. It derives the hidden patterns among the words in the corpus in an unsupervised manner.
  4. Language Modelling:
    Language Modelling is the first crucial step for most of the advanced NLP tasks like Text Summarization and Machine Translation. It involves learning to predict the probability of a sequence of words. It is the same technique that Google uses when it gives search suggestions.
  5. Sequence Modelling:
    Sequence Modelling is a deep learning technique for working with sequence data such as music or lyrics, and for tasks like sentence translation or building chatbots. It is used a lot in NLP because natural language text is essentially sequence-based data.
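
To give a feel for what "predicting the probability of a sequence of words" means in Language Modelling, here is a minimal sketch of a count-based bigram model. This is only a toy: it is not how Google or any modern neural model works, it uses no smoothing, and the tiny corpus, the function names and the "<s>" / "</s>" boundary markers are all choices made for this example.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count contexts and bigrams over sentences padded with boundary markers."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        context_counts.update(tokens[:-1])              # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))   # adjacent word pairs
    return context_counts, bigram_counts


def sequence_probability(sentence, context_counts, bigram_counts):
    """Multiply P(word | previous word) along the sentence (chain rule)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0   # unseen context: the unsmoothed model assigns zero
        probability *= bigram_counts[(prev, word)] / context_counts[prev]
    return probability


# Tiny made-up training corpus.
corpus = ["i like nlp", "i like python", "you like nlp"]
context_counts, bigram_counts = train_bigram_model(corpus)

for candidate in ["i like nlp", "you like python", "nlp like i"]:
    print(candidate, sequence_probability(candidate, context_counts, bigram_counts))
```

Sentences whose word pairs were seen often in the corpus get a higher probability, while a word order that never occurred gets zero, which is exactly the weakness that smoothing and neural language models try to address.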

Conclusion:

This blog introduces Natural Language Processing and gives a basic idea of the different techniques and tools it involves.

References:

https://en.wikipedia.org/wiki/Text_mining
