Note: I also made a video on this topic – https://youtu.be/t5M6s3zuSg4
In this post, I’m going to provide a guide for how to master computer vision with deep learning.
My goal is to do it “backwards” – to start with some topics that might be your goal – and then tell you all the steps required to get there.
Computer vision (CV) generally deals with using images as input. Videos count as images too, since videos are just a series of images.
You can imagine a robot that walks around with a camera attached to it. Being able to intelligently process images is a requirement for that robot to “see” and react accordingly to what it sees.
It is required in self-driving cars to locate pedestrians, traffic signs, and other vehicles. You cannot have self-driving cars without computer vision!
Another meaningful application of computer vision is medical diagnosis. Research has shown that machine learning algorithms can perform on par with, and sometimes even better than, human experts.
Other popular computer vision applications are GANs (for generating realistic-looking, but fake images), and style transfer (for applying the style of one image to the content of another).
One controversial application of computer vision is “DeepFakes”. With this technology, you can record a video of yourself saying something, then create a new video that makes it look and sound as if Barack Obama had said it. Scary stuff!
So where can you learn about all these cool topics? Great question!
My course Advanced Computer Vision covers topics such as object detection, style transfer, and advanced convolutional architectures (such as VGG, ResNet, and Inception).
Object detection algorithms such as SSD (Single-Shot Multibox Detector) and YOLO (You Only Look Once) have a convolutional neural network (CNN) at their center.
So do style transfer networks.
Thus, it’s clear that if you want to learn about advanced applications such as object detection and style transfer, you’re going to have to learn how to build a convolutional neural network first.
My course Deep Learning: GANs and Variational Autoencoders covers generative algorithms such as GANs and Variational Autoencoders (as the title would suggest).
These are used to generate photo-realistic images.
Yann LeCun (Facebook AI Research Director and deep learning pioneer) called GANs “the most interesting idea in the last 10 years in ML.”
Both GANs and Variational Autoencoders are examples of Unsupervised Deep Learning. We covered the principles behind Unsupervised Deep Learning in my course on the subject, which teaches you about important ideas such as shared weights, dimensionality reduction, latent representations, and data visualization. We also cover algorithms such as the Autoencoder (the non-variational kind), the RBM (Restricted Boltzmann Machine), t-SNE (t-distributed Stochastic Neighbor Embedding), and PCA (Principal Components Analysis).
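To make “dimensionality reduction” concrete, here is a minimal PCA sketch in NumPy (my own toy illustration, not code from the course):

```python
import numpy as np

# Toy illustration of dimensionality reduction with PCA.
rng = np.random.default_rng(0)

# 200 points that mostly vary along one direction in 2-D
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0]]) \
    + 0.1 * rng.normal(size=(200, 2))
X = X - X.mean(axis=0)           # center the data first

# The principal components come from the SVD of the centered data
U, S, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[0]                    # project onto the first component

# Fraction of variance captured by the first component
explained = S[0] ** 2 / np.sum(S ** 2)
```

The first principal component captures nearly all of the variance here, so projecting from 2-D down to 1-D loses almost no information; that is dimensionality reduction in a nutshell.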
Back to GANs: the most popular GAN architectures (such as DCGAN, SRGAN, and WGAN) rely on convolutional neural networks.
Alright, so it’s clear that if you want to learn about all these interesting applications of convolutional neural networks, you’ll have to know how to build a convolutional neural network.
Luckily, I have a course on just that!
As you can tell by the name, CNNs involve 2 things:
- Convolution
- Neural Networks
This course is all about (a) what is convolution? and (b) how can I add convolution to neural networks?
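To give you a taste of (a), here is a 2-D convolution written from scratch in NumPy. This is a toy sketch of my own (strictly speaking it computes cross-correlation, which is also what deep learning libraries compute under the name “convolution”):

```python
import numpy as np

# Minimal "valid"-mode 2-D convolution (really cross-correlation,
# which is what deep learning libraries compute).
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # elementwise product of the kernel with each image patch
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector applied to a simple half-black image
image = np.zeros((5, 5))
image[:, 2:] = 1.0                # right half is "white"
kernel = np.array([[-1.0, 1.0]])  # responds to left-to-right jumps
edges = conv2d(image, kernel)     # nonzero only at the edge column
```

A CNN learns kernels like this one automatically, instead of you hand-designing them.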
This, of course, necessitates knowing how to build a neural network (without convolution).
These days, to build neural networks, we use modern deep learning libraries such as Theano, TensorFlow, and PyTorch.
So where can you learn how to build a neural network using these modern libraries?
Well, I’m glad you asked!
I just so happen to have a course on that too.
This course covers (as mentioned above) how to build neural networks in modern deep learning libraries such as Theano, TensorFlow, and PyTorch.
It also covers modern theoretical advancements, such as adaptive learning rate and momentum methods (RMSprop, Nesterov momentum, and Adam), as well as modern regularization techniques such as Dropout and Batch Normalization.
These can all be thought of as “add-ons” to the vanilla backpropagation training algorithm.
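To see what such an “add-on” looks like, here is a single Adam update step written out in NumPy. This is a sketch based on the standard published Adam formulas, not code from the course:

```python
import numpy as np

# One Adam parameter update: the same gradient as vanilla gradient
# descent, but with adaptive, bias-corrected per-parameter step sizes.
def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w^2 (gradient 2w), starting from w = 5
w = np.array(5.0)
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.05)
```

Notice that the only “input” is still the gradient; everything else is bookkeeping layered on top of vanilla gradient descent.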
Modern libraries like Theano, TensorFlow, and PyTorch do “automatic differentiation” and make use of the GPU to greatly speed up training time.
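To demystify “automatic differentiation” a little, here is a toy forward-mode version using dual numbers. The real libraries use reverse-mode autodiff and are far more sophisticated, so treat this purely as a sketch of the idea:

```python
# Toy forward-mode automatic differentiation with "dual numbers":
# every value carries its derivative along with it.
class Dual:
    def __init__(self, val, grad=0.0):
        self.val = val    # function value
        self.grad = grad  # derivative, propagated automatically

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # sum rule
        return Dual(self.val + other.val, self.grad + other.grad)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule, applied automatically
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)

# d/dx of f(x) = x*x + 3x at x = 2 is 2x + 3 = 7
x = Dual(2.0, grad=1.0)   # seed derivative dx/dx = 1
y = x * x + x * 3.0
print(y.val, y.grad)      # prints: 10.0 7.0
```

No calculus was done by hand here; the derivative falls out of the arithmetic rules, which is the essence of automatic differentiation.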
What the heck is backpropagation? And how is a neural network “trained” in the first place?
This is where Data Science: Deep Learning in Python enters the picture.
This course goes over, in painstaking detail, how to train a neural network from basic first principles.
You’ll see how basic mathematics – matrices, vectors, and partial derivatives – form the basis of neural networks.
You’ll learn about what it means for a neural network to “make a prediction”, and also what it means to “train a neural network”.
You’ll learn how to visualize what a neural network does, and how to interpret what a neural network has learned.
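To give a flavor of what training from first principles means, here is a tiny one-hidden-layer network trained with hand-written backpropagation in NumPy (my own minimal sketch, not code from the course):

```python
import numpy as np

# A one-hidden-layer network fit to y = x^2 with hand-coded backprop.
rng = np.random.default_rng(42)

X = rng.uniform(-1, 1, size=(100, 1))
Y = X ** 2                                    # target function

W1 = rng.normal(scale=0.5, size=(1, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.1

losses = []
for _ in range(3000):
    # forward pass
    H = np.tanh(X @ W1 + b1)                  # hidden activations
    Yhat = H @ W2 + b2                        # linear output
    losses.append(np.mean((Yhat - Y) ** 2))   # mean squared error

    # backward pass: the chain rule, layer by layer
    dY = 2 * (Yhat - Y) / len(X)
    dW2 = H.T @ dY;  db2 = dY.sum(axis=0)
    dH = dY @ W2.T * (1 - H ** 2)             # tanh'(a) = 1 - tanh(a)^2
    dW1 = X.T @ dH;  db1 = dH.sum(axis=0)

    # gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

Every step here is just matrices, vectors, and partial derivatives; the deep learning libraries automate exactly this bookkeeping.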
A “neural network” implies a network of neurons.
At this point, you might be wondering, “what is a neuron anyway?”
You guessed it – I’ve covered this too!
A “neuron” is actually a classic machine learning model also known as Logistic Regression.
In this course, you’ll learn the ins and outs of linear classification and how to train a neuron using an algorithm known as gradient descent (like a baby version of backpropagation, in some sense).
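As a sketch of that idea, here is a single “neuron” (logistic regression) trained with gradient descent in NumPy, on made-up toy data of my own:

```python
import numpy as np

# A single "neuron" = logistic regression, trained with gradient descent.
rng = np.random.default_rng(0)

# Two linearly separable clusters of 2-D points
X0 = rng.normal(loc=[-2.0, -2.0], size=(50, 2))
X1 = rng.normal(loc=[+2.0, +2.0], size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # sigmoid "firing rate"
    # gradient of the cross-entropy loss w.r.t. w and b
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w -= lr * grad_w                      # gradient descent step
    b -= lr * grad_b

p = 1 / (1 + np.exp(-(X @ w + b)))
accuracy = np.mean((p > 0.5) == y)
```

The update rule is the same shape as backpropagation, just for a “network” with a single unit.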
What does it mean for a model to be “linear”?
Since you asked, I’ve got this covered too.
You may have noticed that all of these courses have a heavy reliance on writing code.
A huge part (maybe the most important part) of learning how these models work, is learning how to implement them in Python code.
In particular, we make heavy use of libraries such as NumPy, SciPy, Matplotlib, and Pandas.
You can, of course, learn how to use these libraries in my always-free course:
Since a lot of people requested it, I also added a special section to the course covering Machine Learning Basics. It answers questions such as “what is classification?” and “what is regression?”, and gives you a very rudimentary understanding of machine learning using Scikit-Learn.
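To make “what is regression?” concrete, here is a minimal example in NumPy (the course section itself uses Scikit-Learn, but the underlying idea is the same):

```python
import numpy as np

# "Regression" in one line: fit a straight line y = a*x + b to noisy data.
rng = np.random.default_rng(1)

x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)   # noisy line

# Least-squares fit: solve for [a, b] using a design matrix
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
# a and b should come out close to the true values 2.0 and 1.0
```

Classification is the same game, except the thing being predicted is a category instead of a number.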
I consider my free NumPy course the basic starting point for deep learning and machine learning, no matter which field you want to end up specializing in, whether that be computer vision, natural language processing, or reinforcement learning.
These libraries are the basic tools (like the screwdriver, hammer, ruler, …) that you will use to build bigger and more complicated systems.
Keep in mind, there are many more topics in deep learning and artificial intelligence than what I listed here. For a full list of topics, and a guide for what order to learn them in, please see my handy visual guide: “What order should I take your courses in?”