Simple Introduction To Natural Language Processing

How computers can understand human language

Photo by Hannah Wright on Unsplash

We often use Google Assistant to search the web so that we can save the effort of typing the query out.

Google Assistant takes a sequence of words as input and produces a sequence of words as output.

This is one example of Natural Language Processing at work.

What Is Natural Language Processing?

Natural Language Processing, usually shortened to NLP, is a branch of artificial intelligence that deals with the interaction between computers and humans using natural language.

The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a manner that is valuable.

A typical interaction between humans and machines using Natural Language Processing

What is NLP used for?

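NLP technologies often used in day-to-day activities include speech recognition, audio-to-text conversion, interactive voice assistants, and personal assistants.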

Deep Learning in Natural Language Processing

Neural networks provide powerful learning machinery that is very appealing for use in natural language problems.

A major component in neural networks for language is the use of an embedding layer.

In NLP, the most common preprocessing step is converting text into numbers.

There are two main concepts for turning text into numbers.

Tokenization: A direct mapping from a word, character, or sub-word to a numerical value. There are three main levels of tokenization: word-level, character-level, and sub-word-level.

Embedding: An embedding is a learnable representation of natural language, usually a vector of floating-point numbers for each token.

For example, ‘hello’ could be represented by the 5-dimensional vector [-0.8547, 0.4559, -0.3332, 0.9877, 0.1112].

The size of the feature vector is tuneable.
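To make the two steps concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not how any particular library does it: the sample sentence, the whitespace split, and the names `vocab`, `embedding_table`, and `EMBEDDING_DIM` are all made up for this example, and the embedding values are random where a real model would learn them during training. Sub-word tokenization is left out because it normally needs a vocabulary learned from data.

```python
import random

text = "hello world hello nlp"  # hypothetical sample sentence

# Word-level tokenization: split the text on whitespace.
word_tokens = text.split()

# Character-level tokenization: every character is a token.
char_tokens = list("hello")

# Build a vocabulary that maps each distinct word to an integer id,
# in order of first appearance.
vocab = {}
for token in word_tokens:
    if token not in vocab:
        vocab[token] = len(vocab)

# Tokenization result: the sentence as a list of integers.
token_ids = [vocab[token] for token in word_tokens]

# Embedding table: one row of EMBEDDING_DIM floats per vocabulary entry.
# Assumption: random values stand in for weights that would be learned.
EMBEDDING_DIM = 5  # the tunable size of the feature vector
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(len(vocab))
]

# Embedding lookup: replace each token id with its row in the table.
embedded = [embedding_table[token_id] for token_id in token_ids]

print(word_tokens)
print(char_tokens)
print(token_ids)
print(len(embedded), "vectors of size", len(embedded[0]))
```

Both occurrences of 'hello' get the same id and therefore the same vector, which is the point of going through a vocabulary first.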

Architecture Of Neural Network

There are mainly two types of neural network architectures, which can be combined in various ways:

Feed-Forward Networks: These networks, in particular multilayer perceptrons (MLPs), work with fixed-size inputs, or with variable-length inputs in which we can disregard the order of the elements. When fed a set of input components, the network learns to combine them in a meaningful way.

MLPs can be used wherever a linear model was previously used. The nonlinearity of the network, together with the ability to easily integrate pre-trained word embeddings, often leads to superior classification accuracy.
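As a rough sketch of that idea, the code below averages a sentence's word vectors (which throws the word order away and gives a fixed-size input) and passes the result through a one-hidden-layer MLP. Everything here is an assumption for illustration: the weights are random and untrained, biases are omitted for brevity, and the helper names (`random_matrix`, `matvec`, `average`, `mlp_forward`) and layer sizes are hypothetical.

```python
import math
import random

rng = random.Random(0)

def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def average(vectors):
    """Average equal-length vectors; the order of the words is discarded."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def mlp_forward(x, w_hidden, w_out):
    """One hidden layer with a tanh nonlinearity, then a linear output layer."""
    hidden = [math.tanh(h) for h in matvec(w_hidden, x)]
    return matvec(w_out, hidden)

EMBEDDING_DIM, HIDDEN_DIM, NUM_CLASSES = 5, 4, 2
w_hidden = random_matrix(HIDDEN_DIM, EMBEDDING_DIM)
w_out = random_matrix(NUM_CLASSES, HIDDEN_DIM)

# A "sentence" of three random word vectors, and the same words reversed.
word_vectors = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)] for _ in range(3)
]
scores_a = mlp_forward(average(word_vectors), w_hidden, w_out)
scores_b = mlp_forward(average(word_vectors[::-1]), w_hidden, w_out)

# A shorter sentence still produces one score per class.
scores_short = mlp_forward(average(word_vectors[:2]), w_hidden, w_out)

print(scores_a)
print(scores_b)  # same as scores_a up to floating-point rounding
print(len(scores_short))
```

Because the input is an average, the two orderings give the same class scores (up to rounding), and sentences of any length map to the same fixed-size input.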

Recurrent Neural Networks (RNNs): These specialized networks are mostly used for sequential data. They take a sequence of items as input and produce a fixed-size vector that summarizes that sequence.

Recurrent Neural Networks are slightly different from traditional neural networks.

Traditional neural networks cannot carry information from earlier inputs over to later ones, and this seems like a major shortcoming. For example, imagine you want to classify what kind of event is happening at every point in a movie. It is unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones.

Recurrent neural networks address this issue. They are networks with loops in them, allowing information to persist.

In simple terms, an RNN loops over the sequence and passes information from one step to the next.
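The loop can be sketched in a few lines of plain Python. This is a minimal, untrained simple (Elman-style) recurrent cell, written under assumptions: the weights are random, biases are omitted, and the names `rnn_summarize`, `w_input`, and `w_hidden` and the sizes are hypothetical choices for this illustration.

```python
import math
import random

rng = random.Random(0)

def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_summarize(inputs, w_input, w_hidden, hidden_dim):
    """Run a simple recurrent cell over a sequence; return the final state."""
    hidden = [0.0] * hidden_dim  # initial state: nothing has been seen yet
    for x in inputs:
        from_input = matvec(w_input, x)           # contribution of this step
        from_previous = matvec(w_hidden, hidden)  # information carried forward
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_previous)]
    return hidden

INPUT_DIM, HIDDEN_DIM = 5, 3
w_input = random_matrix(HIDDEN_DIM, INPUT_DIM)
w_hidden = random_matrix(HIDDEN_DIM, HIDDEN_DIM)

# Two sequences of different lengths, made of random input vectors.
short_sequence = [
    [rng.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)] for _ in range(2)
]
long_sequence = [
    [rng.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)] for _ in range(7)
]

summary_short = rnn_summarize(short_sequence, w_input, w_hidden, HIDDEN_DIM)
summary_long = rnn_summarize(long_sequence, w_input, w_hidden, HIDDEN_DIM)

print(len(summary_short), len(summary_long))
```

The `hidden` list is the part that persists: each step mixes the current input with the state left by the previous step, so a sequence of any length ends up summarized in a vector of `HIDDEN_DIM` numbers.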

Summary

Natural Language Processing is one of the most common and popular application areas of deep learning.

Some common applications of NLP are speech recognition, audio-to-text conversion, interactive voice assistants, and personal assistants.

The most common preprocessing step in NLP is converting text data into numbers. This can be done with tokenization and embedding.

Tokenization assigns a number to each piece of text (a word, character, or sub-word). Embedding then maps each of those numbers to a vector of floating-point values.

The neural network architecture most commonly used in Natural Language Processing is the Recurrent Neural Network, which loops over the sequence and passes information from one step to the next.

Parvez Sohail

Hey, I am an enthusiast in Machine Learning and Data Science. I love to share my work.