Path to mastering Natural Language Processing (NLP) with Deep Learning

In this post, I’m going to provide a guide for how to master natural language processing with deep learning.

My goal is to do it “backwards” – to start with some topics that might be your goal – and then tell you all the steps required to get there.

Natural language processing (NLP) deals with text and speech. Knowing how to deal with text is vital because it is the most abundant source of information. Anything you want to remember for later, you write it down as text. Anything you want to share with others, you write it down as text.

Text is the most important medium for information exchange between people.

Dealing with text is also extremely important for genomics (DNA). In particular, it can be thought of as a translation problem. When we want to translate from one language to another, we have to extract meaning. Understanding DNA requires understanding the language of the genome, and converting that into a language we as humans can understand.

DNA is essentially the “programming language” of nature. Going along that theme, our current NLP models can understand actual programming languages as well. GitHub Copilot is a product built on OpenAI Codex, a descendant of GPT-3, that generates code suggestions.

Other real-world applications of NLP:

  • Spam detection
  • Sentiment analysis
  • Machine translation
  • Chatbots and assistants
  • Question answering
  • Text summarization
  • Information retrieval
  • Article spinning
  • Spelling correction
  • Search engines
  • Topic modeling

Every time you use Siri, Google Assistant, Amazon Alexa, etc… that’s NLP at work!

So where can you learn about all these cool topics? Great question!

My course Deep Learning: Advanced NLP and RNNs covers topics such as:

  • Neural machine translation
  • Speech recognition
  • Question answering
  • Text classification
  • Chatbots
  • Generating poetry

I show you how you can even do image recognition with Bidirectional RNNs!

We go over important modern deep learning algorithms such as seq2seq (sequence-to-sequence) and attention.

In this course, we also review basic recurrent architectures such as the LSTM (long short-term memory) and GRU (gated recurrent unit).

But where can you get a “first principles” look at what these are, why we need them, and how they work?

Great question!

Luckily, I have a course on just that:

Deep Learning: Recurrent Neural Networks in Python

You’ll notice that Deep Learning: Advanced NLP and RNNs also makes use of the important concept of word embeddings.

In Natural Language Processing with Deep Learning in Python, we cover word embeddings in depth.

You learn about famous word embedding algorithms such as word2vec and GloVe, as well as how to use RNNs for NLP tasks, and an architecture for sentiment analysis called Recursive Neural Tensor Networks (RNTN), which was state of the art when it was introduced.

These are neural networks structured like trees – I like to call them “tree neural networks”.

In Data Science: Natural Language Processing in Python, we cover NLP from a very basic perspective: that it’s simply applying machine learning to text.

The idea of this course is to be beginner-friendly by covering basic tasks almost anyone can understand:

  • Spam detection
  • Sentiment analysis
  • Latent semantic indexing (of SEO industry fame)
  • Article spinning
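To make "applying machine learning to text" concrete, here is a minimal sketch of spam detection using a bag-of-words representation and a naive Bayes classifier. Naive Bayes is my choice for illustration and is not necessarily the classifier used in the course. The training sentences and the helper names (`tokenize`, `train_naive_bayes`, `predict`) are hypothetical.

```python
import math
from collections import Counter

# Tiny hand-written training set (hypothetical, for illustration only).
TRAIN = [
    ("win cash prize now", "spam"),
    ("claim your free prize", "spam"),
    ("free cash offer", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
    ("notes from the meeting", "ham"),
]

def tokenize(text):
    """Lowercase and split on whitespace: the simplest bag-of-words tokenizer."""
    return text.lower().split()

def train_naive_bayes(examples):
    """Count words per class and documents per class."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def predict(text, model):
    """Return the label with the highest log-probability under the model."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAIN)
print(predict("free prize", model))
print(predict("team meeting tomorrow", model))
```

Word order is ignored entirely here. Only the word counts matter, which is why this representation is called a "bag" of words.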

A big component of the above courses is language modeling.
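If language modeling is new to you, the rough idea is to predict the next word given the words before it. The sketch below is a bigram model built from raw counts, which is about the simplest version of that idea. The toy corpus and function names are hypothetical, and the deep learning courses replace the count table with a neural network.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real language model trains on far more text.
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

def generate(counts, rng, max_len=10):
    """Sample a sentence one word at a time from the bigram distribution."""
    word, out = "<s>", []
    for _ in range(max_len):
        candidates = counts[word]
        word = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

bigrams = train_bigram_model(CORPUS)
print(next_word_prob(bigrams, "the", "cat"))
print(generate(bigrams, random.Random(0)))
```

Notice that the only training data is the text itself. The "label" for each word is simply the word that comes next.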

Language modeling is an unsupervised task, and we first discussed unsupervised deep learning in my course:

Unsupervised Deep Learning in Python

This course teaches you about important ideas such as shared weights, dimensionality reduction, latent representations, and data visualization. We also cover algorithms such as the autoencoder (the non-variational kind), RBM (Restricted Boltzmann Machine), t-SNE (t-distributed Stochastic Neighbor Embedding), and PCA (Principal Components Analysis).
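As a taste of dimensionality reduction, here is a minimal PCA sketch using NumPy's singular value decomposition. The synthetic data and the `pca` helper are assumptions for illustration, and this is one common way to compute PCA rather than the specific implementation from the course.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance captured by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, components, explained[:n_components]

# Synthetic data (an assumption for illustration): 3-D points that mostly
# vary along a single direction, plus a little noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))

Z, components, explained = pca(X, n_components=2)
print(Z.shape)      # (200, 2)
print(explained)    # the first entry should dominate for this data
```

The projected data `Z` is the kind of low-dimensional "latent representation" you can plot directly.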

All of the above courses necessitate knowing about how a neural network works and how to build one in the first place.

It’s not something you’re just going to up and do in one day.

These days, to build neural networks, we use modern deep learning libraries such as Theano, TensorFlow, and PyTorch.

So where can you learn how to build a neural network using these modern libraries?

Well, I’m glad you asked!

I just so happen to have a course on that too.

Modern Deep Learning in Python

This course covers (as mentioned above) how to build neural networks in modern deep learning libraries such as Theano, TensorFlow, and PyTorch.

It also covers modern improvements to training, such as Nesterov momentum and adaptive learning rate methods (RMSprop and Adam), as well as modern regularization techniques such as Dropout and Batch Normalization.

These can all be thought of as “add-ons” to the vanilla backpropagation training algorithm.
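To see what an "add-on" looks like in practice, the sketch below compares three update rules on a one-dimensional toy loss: plain gradient descent, Nesterov momentum, and Adam. The loss function, the hyperparameter values, and the function names are assumptions for illustration. In a real network the gradient would come from backpropagation rather than from a hand-written formula.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, steps, lr=0.1):
    """Vanilla update: step directly along the negative gradient."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def nesterov(w, steps, lr=0.1, mu=0.9):
    """Nesterov momentum: evaluate the gradient at the look-ahead point."""
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w + mu * v)
        w += v
    return w

def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: scale each step using running averages of the gradient moments."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

for name, optimizer in [("gd", gradient_descent), ("nesterov", nesterov), ("adam", adam)]:
    print(name, optimizer(0.0, 500))
```

All three functions consume the same gradient. Only the rule for turning that gradient into a weight update changes, which is the sense in which these methods sit on top of backpropagation.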

Modern libraries like Theano, TensorFlow, and PyTorch do “automatic differentiation” and make use of the GPU to greatly speed up training time.

But wait!

What the heck is backpropagation? And how is a neural network “trained” in the first place?


This is where Data Science: Deep Learning in Python enters the picture.

This course goes over, in painstaking detail, how to train a neural network from basic first principles.

You’ll see how basic mathematics – matrices, vectors, and partial derivatives – form the basis of neural networks.

You’ll learn about what it means for a neural network to “make a prediction”, and also what it means to “train a neural network”.
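As a rough preview, "making a prediction" boils down to a few matrix multiplications. The sketch below is a forward pass through a network with one hidden layer. The layer sizes, the tanh activation, and the random (untrained) weights are all assumptions for illustration.

```python
import numpy as np

def softmax(a):
    """Turn each row of scores into probabilities that sum to 1."""
    a = a - a.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W1, b1, W2, b2):
    """One hidden layer: X -> tanh(X W1 + b1) -> softmax(Z W2 + b2)."""
    Z = np.tanh(X @ W1 + b1)
    return softmax(Z @ W2 + b2)

# Hypothetical sizes: 4 input features, 5 hidden units, 3 output classes.
rng = np.random.default_rng(0)
D, M, K = 4, 5, 3
W1, b1 = rng.normal(size=(D, M)), np.zeros(M)
W2, b2 = rng.normal(size=(M, K)), np.zeros(K)

X = rng.normal(size=(6, D))            # a batch of 6 random inputs
probs = forward(X, W1, b1, W2, b2)     # class probabilities, shape (6, 3)
predictions = probs.argmax(axis=1)     # most probable class for each input
print(probs.shape, predictions)
```

Because the weights here are random, the predictions are meaningless. "Training" means adjusting `W1`, `b1`, `W2`, and `b2` so that the predicted probabilities match the targets.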

You’ll learn how to visualize what a neural network does, and how to interpret what a neural network has learned.

A “neural network” implies a network of neurons.

At this point, you might be wondering, “what is a neuron anyway?”

You guessed it – I’ve covered this too!

Deep Learning Prerequisites: Logistic Regression in Python

A “neuron” is actually a classic machine learning model also known as Logistic Regression.

In this course, you’ll learn the ins and outs of linear classification and how to train a neuron using an algorithm known as gradient descent (like a baby version of backpropagation, in some sense).
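Here is a minimal sketch of that idea: a single neuron (logistic regression) trained with gradient descent on the cross-entropy loss. The synthetic two-cluster data, the learning rate, and the number of epochs are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, epochs=1000):
    """Fit weights w and bias b by gradient descent on the mean cross-entropy."""
    N, D = X.shape
    w, b = np.zeros(D), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # current predicted probabilities
        grad_w = X.T @ (p - y) / N        # gradient with respect to w
        grad_b = np.mean(p - y)           # gradient with respect to b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data (an assumption): two Gaussian clusters, one per class.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = train_logistic_regression(X, y)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(w, b, accuracy)
```

The decision boundary is the line where `X @ w + b` equals zero, which is what makes this a linear classifier.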

What does it mean for a model to be “linear”?

Since you asked, I’ve got this covered too.

Deep Learning Prerequisites: Linear Regression in Python

You may have noticed that all of these courses have a heavy reliance on writing code.

A huge part (maybe the most important part) of learning how these models work is learning how to implement them in Python code.

In particular, we make heavy use of libraries such as NumPy, SciPy, Matplotlib, and pandas.

You can, of course, learn how to use these libraries in my always-free course:

Deep Learning Prerequisites: The Numpy Stack in Python

Since a lot of people requested it, I also added a special section to the course that covers Machine Learning Basics. It answers questions such as “what is classification?” and “what is regression?”, and gives you a very rudimentary understanding of machine learning using scikit-learn.

I consider my free Numpy course the basic starting point to deep learning and machine learning, no matter what field you want to end up specializing in, whether that be computer vision, natural language processing, or reinforcement learning.

These libraries are the basic tools (like the screwdriver, hammer, ruler, …) that you will use to build bigger and more complicated systems.

Keep in mind, there are many more topics in deep learning and artificial intelligence than what I listed here. For a full list of topics, and a guide for what order to learn them in, please see my handy visual guide: “What order should I take your courses in?”