A very common beginner question is: “Why do I have to learn all this theory? Can’t we just jump straight to the code?”
Short answer: OH GOD NO.
A common objection goes: “Yes, but I’m a professional. I just want to learn what to do; I don’t care about why it works.”
This is not professional at all. Any engineering manager or coworker will regard you with strong suspicion if you have no idea what you are coding and are only copying something from an online course.
If I interviewed you and you told me this was your approach, you would NOT be hired. If anyone I know interviewed you, you would NOT be hired.
Sorry, but in a business that wants to make money, there is no place for such a lazy and incompetent attitude in my workforce.
Simply put: I’d rather pay someone who knows what they are doing.
The problem with such employees is they “don’t know what they don’t know”. They believe knowing any theory is useless (of course, those beliefs are wrong, because their lack of knowledge prevents them from understanding why that knowledge would be useful).
It would be like trying to tell LeBron James how to improve his basketball when you are not yourself a professional basketball player.
When it comes to pro basketball, you also “don’t know what you don’t know”. Maybe you can guess what it takes to become good, but you really have no idea.
Another way of saying they “don’t know what they don’t know” is they are “unconsciously incompetent”. They don’t even know that they are incompetent, which is why they actively try to deny it.
When you have “unconscious incompetence”, you are incompetent, but you don’t know that you are.
In this short article, I will describe what “in-depth” means as it pertains to courses on machine learning and data science.
To do this, I will provide several examples.
Example 1) Typical Udemy Course
Example 2) Typical College (“Academic”) Course
Example 3) My Courses
Example 1: Typical Udemy Course
A typical Udemy course on machine learning spends ~20 minutes talking about how “cool” and “powerful” an algorithm is (useless).
Then, the instructor spends ~20 minutes talking about “intuition” (maybe a few pictures here and there). Importantly, no math is used, no pseudocode is presented.
Then, the instructor spends ~30-45 minutes explaining 3 lines of code:
model = MyModel()
Why is this bad?
You still have no idea how the model really works. Your “intuition” is often wrong.
Sometimes, even the instructor himself is wrong! (That’s because most instructors teaching these courses are marketers, not practicing data scientists.) They cannot teach the math or the algorithms, because they don’t understand them themselves.
Look at the code above. Is it even related to any algorithm? This code has no relation to understanding how the model works. As such, it is useless for learning “machine learning”.
You can’t implement the model. As I always say, “If you can’t implement it, then you don’t understand it”. Or, as the famous physicist Richard Feynman once said, “What I cannot create, I do not understand”.
You just paid someone to teach you 3 lines of code and some flaky “intuition”. You could have learned that for FREE! Check the Scikit-Learn documentation.
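To make the point concrete, here is a minimal sketch of what could sit behind a call like `MyModel()`. The choice of one-variable least-squares linear regression is an assumption for illustration, and so are the `fit` and `predict` method names, which are borrowed from the Scikit-Learn convention. The snippet above shows only the constructor.

```python
class MyModel:
    """Hypothetical stand-in: 1-D least-squares linear regression, y ~ w*x + b."""

    def fit(self, X, Y):
        n = len(X)
        mean_x = sum(X) / n
        mean_y = sum(Y) / n
        # Closed-form solution obtained by setting the derivatives of the
        # squared error to zero: w = cov(x, y) / var(x), b = mean_y - w * mean_x
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
        var_x = sum((x - mean_x) ** 2 for x in X)
        self.w = cov_xy / var_x
        self.b = mean_y - self.w * mean_x
        return self

    def predict(self, X):
        return [self.w * x + self.b for x in X]


# Toy data that lies exactly on the line y = 2x + 1
X = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 3.0, 5.0, 7.0]

# The API-style usage (method names assumed, see above)
model = MyModel()
model.fit(X, Y)
predictions = model.predict(X)
```

The last three lines are all an API-only lesson would cover. The class body is the part that requires knowing where `w` and `b` come from.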
Example 2: Typical College / University Course
If I only had a choice between #1 and #2, I would usually choose #2. Despite the problems listed below, a college course on ML is far superior to the typical “for dummies” Udemy course.
The typical college course proceeds as follows:
Show all the math and pseudocode (usually), but go through them at super speed! Usually, each slide contains like 3-5 equations (way too much packed onto one slide).
Luckily, college ML courses do not use Scikit-Learn (unlike typical Udemy courses). Students do get to practice implementing algorithms (good).
Unfortunately, because the class moves so quickly, they will not get to implement everything. Usually, each lecture will cover 1 or 2 algorithms (suppose there are 12 lectures per semester, so you will learn ~12-20 different algorithms in the whole course – too many, too fast, in my opinion).
But, since you can’t have 20 homework assignments (don’t get any crazy ideas professors!), you will likely only get to implement a few (e.g. logistic regression, k-means clustering, neural networks).
The end result: the course goes so fast that students end up learning very little. They have some idea how each algorithm works, but not in-depth (even though all the equations were shown, it was simply too fast).
Another problem: Because implementation is only a homework assignment, you never get to see a reference implementation. As you probably know, not everyone is an A+ student. Why does that matter?
Well, you lose marks when you get things wrong. A student who gets a B or a C on these assignments got some things wrong. They still passed the course, but they never learned to code the algorithm correctly.
So even if you “pass the course”, you are still missing knowledge!
Example 3: My Courses
In my courses, I aim to fix these shortcomings of both the typical Udemy course and the typical academic course.
How? By going in-depth into each subject.
Don’t cover every machine learning algorithm under the sun. Focus on one or a few algorithms per course.
For each algorithm, derive any equations used from first principles (typically, that means calculus, linear algebra, and probability, unless the course has other prerequisites).
For each algorithm, derive the “algorithmic” part / pseudocode from first principles
For each algorithm, implement it in code from scratch. This requires true understanding. It forces you to understand the equations and not skip over the details. It forces you to think about how the algorithm works, rather than just using flawed “intuition”.
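As an illustration of "implement it in code from scratch", here is a minimal NumPy sketch of logistic regression trained with batch gradient descent. The algorithm choice, the function names, the synthetic data, and the hyperparameters are all assumptions made for this example, not material taken from any particular course.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, learning_rate=0.1, n_steps=2000):
    """Batch gradient descent on the mean binary cross-entropy."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)   # predicted P(y = 1 | x)
        error = p - y            # gradient of the loss w.r.t. the logits (times N)
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b


def predict(X, w, b):
    return (sigmoid(X @ w + b) >= 0.5).astype(int)


# Synthetic two-class data: two Gaussian blobs (illustrative only)
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, scale=1.0, size=(100, 2))
X1 = rng.normal(loc=+2.0, scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = fit_logistic_regression(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Every line of the update rule corresponds to a term in the derived gradient, so a mistake in the math shows up immediately as a model that fails to learn.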
Advantage #1: Anyone should be able to fully understand the material as long as they meet the prerequisites (usually basic calculus, linear algebra, probability, and programming). This is because everything is built up logically from these basic first principles.
Advantage #2: Because everything is derived from first principles, it’s unlike a typical academic course, where too many equations are shown all at once, and it’s not clear how the flow of logic proceeds unless you go through them slowly by yourself. Instead, each equation is shown one-by-one.
If you don’t understand it, then simply pause the video, and review it again, or use the Q&A to clarify your misunderstanding.
In a college class, things go by so fast you rarely have the opportunity to even think of the right question to ask before the lecturer moves on to the next subject.
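As a sketch of what "one equation at a time" can look like, here is the gradient of the logistic regression loss derived step by step. The choice of example and the symbols ($w$, $b$, $x_i$, $y_i$, $p_i$, $N$) are assumptions for illustration.

```latex
% Model: predicted probability for sample i
z_i = w^\top x_i + b, \qquad p_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}

% Loss: mean binary cross-entropy over N samples
J = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]

% Step 1: derivative of the sigmoid
\frac{d\sigma}{dz} = \sigma(z) \left( 1 - \sigma(z) \right)

% Step 2: chain rule through p_i
\frac{\partial J}{\partial z_i}
  = -\frac{1}{N} \left[ y_i (1 - p_i) - (1 - y_i) p_i \right]
  = \frac{1}{N} (p_i - y_i)

% Step 3: chain rule through z_i = w^T x_i + b
\frac{\partial J}{\partial w} = \frac{1}{N} \sum_{i=1}^{N} (p_i - y_i) \, x_i,
\qquad
\frac{\partial J}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} (p_i - y_i)
```

Each step uses only basic calculus, so a viewer who pauses on any single line can check it before moving on.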
Advantage #3: You get to implement everything from scratch. This means you have true understanding. Unlike the typical academic course, where implementation is just homework, I am going to show you a reference example of the code. So even if you get it wrong, you still have a chance to fix your mistakes.
So what does “in-depth” mean?
Deriving all the math (unlike Udemy courses)
Deriving all the algorithms and pseudocode (unlike Udemy courses)
Showing the implementation of the code (unlike academic courses)
Common Beginner Mistake
Many beginners confuse the word “depth” with “breadth”.
For example, a beginner may come across a 20-40 hour course and proclaim, “This course covers so many topics! It is very in-depth!”
In actuality, it’s the opposite of deep. It’s shallow.
Spending 10 hours on basic Python, 1 hour on linear regression, 1 hour on logistic regression, etc. etc. means you only learned very basic stuff.
You didn’t learn anything “deep”. You only have superficial understanding.
Beginners often think that to be “in-depth”, a course must cover many different algorithms. That is the opposite of depth. That is called breadth.
Depth means going over the details of an algorithm. Specifically:
Deriving each equation being used and the intuition behind them.
Deriving any algorithms that must be put into code.
Walking through the pseudocode.
Implementing the actual code (pseudocode is nice, but it can’t run on your computer).
This is called “depth” because it goes over every nook and cranny of an algorithm or model, leaving no stone unturned.
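To show how pseudocode maps onto runnable code, here is a minimal NumPy sketch of k-means clustering, where each pseudocode step (assign, update, check convergence) becomes a few lines. The function name, the initialisation scheme, and the synthetic data are assumptions chosen for this example.

```python
import numpy as np


def k_means(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise the centroids as k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: squared Euclidean distance from every point
        # to every centroid, then pick the nearest centroid.
        distances = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Convergence check: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Two well-separated blobs (illustrative data only)
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

centroids, labels = k_means(X, k=2)
print(centroids)
```

Details such as what to do with an empty cluster never appear in an API call, but they cannot be skipped when writing the loop yourself.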
“Breadth” is the opposite. This means covering many different topics in a shallow fashion (the word “shallow” is the opposite of “deep”).
Typically that involves what I described in Example 1 (very brief explanation, a few lines of code to call an API).
Breadth is useless. I can obtain breadth for free by reading the Scikit-Learn documentation.
This is a MASSIVE (over 24 hours) Deep Learning course covering EVERYTHING from scratch. That includes:
Machine learning basics (linear neurons)
ANNs, CNNs, and RNNs for images and sequence data
Time series forecasting and stock predictions (+ why all those fake data scientists are doing it wrong)
NLP (natural language processing)
Transfer learning for computer vision
GANs (generative adversarial networks)
Deep reinforcement learning and applying it by building a stock trading bot
IN ADDITION, you will get some unique and never-before-seen VIP projects:
Estimating prediction uncertainty
Drawing the standard deviation of the prediction along with the prediction itself. This is useful for heteroskedastic data (that means the variance changes as a function of the input). The most popular application where heteroskedasticity appears is stock prices and stock returns – which I know a lot of you are interested in.
It allows you to draw your model predictions along with their standard deviation.
[Figure: model predictions drawn with their standard deviation]
Sometimes, the data is simply such that a spot-on prediction can’t be made. But we can do better by letting the model tell us how certain it is in its predictions.
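One common way to let a model report its own uncertainty is to have it predict a standard deviation alongside the mean and to score both with the Gaussian negative log-likelihood. That this is the method used in the course is an assumption, as the text above does not name one. The NumPy sketch below does no training. It only evaluates that loss on synthetic heteroskedastic data, comparing an input-dependent standard deviation against a single constant one.

```python
import numpy as np


def gaussian_nll(y, mean, std):
    """Average negative log-likelihood of y under N(mean, std**2)."""
    return np.mean(0.5 * np.log(2 * np.pi * std**2) + 0.5 * ((y - mean) / std) ** 2)


# Synthetic heteroskedastic data: the noise level grows with the input.
rng = np.random.default_rng(0)
x = np.linspace(0.1, 3.0, 2000)
true_std = 0.5 * x
y = 2.0 * x + rng.normal(0.0, true_std)

# For the sketch, assume the mean is predicted perfectly.
mean = 2.0 * x

# Option A: uncertainty that varies with the input.
nll_input_dependent = gaussian_nll(y, mean, true_std)

# Option B: one constant standard deviation for every input.
constant_std = np.sqrt(np.mean((y - mean) ** 2)) * np.ones_like(x)
nll_constant = gaussian_nll(y, mean, constant_std)

# The band one would draw around the prediction: mean +/- one standard deviation.
lower, upper = mean - true_std, mean + true_std

print(f"input-dependent std NLL: {nll_input_dependent:.3f}")
print(f"constant std NLL:        {nll_constant:.3f}")
```

The input-dependent version scores a lower loss because it is narrow where the data is quiet and wide where the data is noisy, which is exactly what the plotted band conveys.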
Facial recognition with siamese networks
This one is cool. I mean, I don’t have to tell you how big facial recognition has become, right? It’s the single most controversial technology to come out of deep learning. In the past, we looked at simple ways of doing this with classification, but in this section I will teach you about an architecture built specifically for facial recognition.
You will learn how this can work even on small datasets – so you can build a network that recognizes your friends or can even identify all of your coworkers!
You can really impress your boss with this one. Surprise them one day with an app that calls out your coworkers by name every time they walk by your desk. 😉
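For readers who want a feel for the idea, here is a minimal NumPy sketch of the core of a siamese setup: one shared embedding function applied to both inputs, and a loss on the distance between the two embeddings. The contrastive loss is one common choice and is an assumption here, since the text above does not say which loss the course uses. The `embed` function is a hypothetical stand-in (a fixed random linear map), not a trained network.

```python
import numpy as np


def embed(x, W):
    """Hypothetical shared embedding: the same weights W are applied to every input."""
    return np.tanh(x @ W)


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """same = 1 for pairs of the same person, 0 for pairs of different people."""
    d = np.linalg.norm(emb_a - emb_b, axis=1)
    loss_same = same * d**2                                     # pull matching pairs together
    loss_diff = (1 - same) * np.maximum(0.0, margin - d) ** 2   # push others at least `margin` apart
    return np.mean(loss_same + loss_diff)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))         # stand-in for learned weights
faces_a = rng.normal(size=(3, 8))   # stand-in for three face images as 8-dim vectors
faces_b = faces_a.copy()            # identical inputs give identical embeddings

emb_a, emb_b = embed(faces_a, W), embed(faces_b, W)

# Identical pairs labelled "same" cost nothing; labelled "different",
# they pay the full margin penalty.
loss_if_same = contrastive_loss(emb_a, emb_b, same=np.ones(3))
loss_if_different = contrastive_loss(emb_a, emb_b, same=np.zeros(3))
```

Because the network learns a distance rather than a fixed set of classes, a new person can be recognised by comparing embeddings, which is one reason this kind of setup can work on small datasets.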
Please note: The VIP coupon will work only for the next month (ending May 1, 2020). It’s unknown whether the VIP period will renew after that time.
After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access on deeplearningcourses.com.
This course is designed to be a beginner to advanced course. All that is required is that you take my free Numpy prerequisites to learn some basic scientific programming in Python. And it’s free, so why wouldn’t you!?
You will learn things that took me years to learn on my own. For many people, that is worth tens of thousands of dollars by itself.
There is no heavy math, no deriving backpropagation by hand, etc. Why? Because I already have courses on those things. So there’s no need to repeat them here, and PyTorch doesn’t require them: its automatic differentiation computes the gradients for you. So you can relax and have fun. =)
All of my deep learning courses until now have been in TensorFlow (and prior to that Theano).
So why learn PyTorch?
Does this mean my future deep learning courses will use PyTorch?
In fact, if you have traveled in machine learning circles recently, you will have noticed that there has been a strong shift to PyTorch.
Case in point: OpenAI switched to PyTorch earlier this year (2020).
Major AI shops such as Apple, JPMorgan Chase, and Qualcomm have adopted PyTorch.
PyTorch is primarily maintained by Facebook (Facebook AI Research to be specific) – the “other” Internet giant that, alongside Google, has a strong vested interest in developing state-of-the-art AI.
But why PyTorch for you and me? (aside from the fact that you might want to work for one of the above companies)
As you know, TensorFlow has adopted the super simple Keras API. This makes common things easy, but it makes uncommon things hard.
With PyTorch, common things take a tiny bit of extra effort, but the upside is that uncommon things are still very easy.
Creating your own custom models and inventing your own ideas is seamless. We will see many examples of that in this course.
For this reason, it is very possible that future deep learning courses will use PyTorch, especially for those advanced topics that many of you have been asking for.
Because of the ease with which you can do advanced things, PyTorch is the main library used by deep learning researchers around the world. If that’s your goal, then PyTorch is for you.
In terms of growth rate, PyTorch dominates TensorFlow. In papers at major machine learning conferences, PyTorch now outnumbers TensorFlow by 2:1 and even 3:1. Many researchers hold that PyTorch is superior to TensorFlow in terms of the simplicity of its API, and even speed / performance!