Why you must learn what you’re coding before writing code

April 25, 2020

A very common beginner question is: “Why do I have to learn all this theory? Can’t we just jump straight to the code?”

Short answer: OH GOD NO.

One common objection goes: “Yes, but I’m a professional. I just want to learn what to do; I don’t care about why it works.”

This is not professional at all. Any engineering manager or coworker will regard you with strong suspicion if you have no idea what you are coding and are only copying something from an online course.

If I interviewed you, and you told me this was your approach, you would NOT be hired. If anyone I know interviewed you, you would NOT be hired.

Sorry, but in a business that wants to make money, there is no place for such a lazy and incompetent attitude in my workforce.

Simply put: I’d rather pay someone who knows what they are doing.

The problem with such employees is they “don’t know what they don’t know”. They believe knowing any theory is useless. Of course, that belief is itself a product of their lack of knowledge: without the knowledge, they cannot see why it would be useful.

It would be like trying to tell LeBron James how to improve his basketball when you are not yourself a professional basketball player.

When it comes to pro basketball, you also “don’t know what you don’t know”. Maybe you can guess what it takes to become good, but you really have no idea.

Another way of saying they “don’t know what they don’t know” is they are “unconsciously incompetent”. They don’t even know that they are incompetent, which is why they actively try to deny it.

When you have “unconscious incompetence”, you are incompetent, but you don’t know that you are.

This is the lowest stage of competence in the “4 stages of competence” (source: https://en.wikipedia.org/wiki/Four_stages_of_competence).



Others might say: “Yes, but I already know the theory, I just want to learn the code”. This is illogical.

And usually it is a lie.

If they already knew the theory, then they wouldn’t have had so much trouble with it. 

If you just want to learn some code, why don’t you just copy one of the thousands of examples on GitHub?

If you already know the theory, then understanding the code should be easy (unless, of course, you don’t understand it as well as you think you do!).

What’s the purpose of taking a course?

“Because I have X, Y, Z specific questions about the code.”

No, no course will answer your specific questions.

I hate to give you the bad news, but: no course is personalized to you.

Courses are for a general audience. In my case, courses will have thousands of students.

Courses cannot, and should not, be personalized to your specific scenario. That is absurd.

If you need specific questions answered, then the proper course of action is to hire a consultant or a tutor.

If you’re taking a course, then be ready for that course to be geared towards a general audience, not your specific questions and concerns.

As such, the most general format for learning a machine learning algorithm or topic is:

Step 1) Learn the “theory” behind the algorithm

Step 2) Implement that algorithm using computer code
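
To make the two steps concrete, here is a minimal sketch using simple linear regression. The algorithm is only an illustrative choice, and the function name `fit_line` and the toy data are made up for this example. Step 1 is the theory: minimizing the squared error of y = a*x + b gives a = cov(x, y) / var(x) and b = mean(y) - a * mean(x). Step 2 is turning that result into code:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b for 1-D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Theory: setting the derivatives of the squared error to zero gives
    #   a = cov(x, y) / var(x)   and   b = mean(y) - a * mean(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


# Toy data generated by y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
```

Every line of the function maps back to a term in the formula, which is only possible if you did Step 1 first.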

“What if I’m forgetful? Can you remind me?”

April 18, 2020

I always laugh when I hear this question.

It’s very funny to me, because it presumes that I’m able to predict what you will forget and when.

Guys, I hate to break it to you, but… I have no psychic powers (at least, not that I’m aware of).

What do I always recommend, again and again?

See: “How to succeed in this course” (a lecture included in all of my courses).

It’s to: take notes.

If you’re not taking notes, you are doing it wrong.

See my post about taking hand-written notes here: https://lazyprogrammer.me/taking-hand-written-notes/

When was the last time you took a college class (one that wasn’t an easy “bird course”) and didn’t take notes?


Don’t treat online learning any differently.

And you know, I actually do remind people of things.

But you know what happens then?

Then I have other students who complain that I’ve given them too many reminders!

I really wish I could get all my students in one room to argue amongst themselves about who is right. 🙂

What Does “In-Depth” Mean?

April 15, 2020

In this short article, I am going to describe what “in-depth” means, as it pertains to courses on machine learning and data science.

To do this, I will provide several examples.

Example 1) Typical Udemy Course
Example 2) Typical College (“Academic”) Course
Example 3) My Courses

Example 1: Typical Udemy Course

A typical Udemy course on machine learning spends ~20 minutes talking about how “cool” and “powerful” an algorithm is (useless).

Then, the instructor spends ~20 minutes talking about “intuition” (maybe a few pictures here and there). Importantly, no math is used and no pseudocode is presented.

Then, the instructor spends ~30-45 minutes explaining 3 lines of code:

model = MyModel()
model.fit(X, Y)

Why is this bad?

  • You still have no idea how the model really works. Your “intuition” is often wrong.
  • Sometimes, even the instructor himself is wrong! (That’s because most instructors teaching these courses are marketers, not practicing data scientists.) They cannot teach the math or algorithms parts, because they don’t understand them themselves.
  • Look at the code above. Is it even related to any algorithm? This code has no relation to understanding how the model works. As such, it is useless for learning “machine learning”.
  • You can’t implement the model. As I always say, “If you can’t implement it, then you don’t understand it”. Or, as the famous physicist Richard Feynman once said, “What I cannot create, I do not understand”.
  • You just paid someone to teach you 3 lines of code and some flaky “intuition”. You could have done that for FREE! Check the Scikit-Learn documentation.
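
For contrast, here is a rough sketch of what can sit behind a `fit` call. `MyModel` above is a placeholder, so the algorithm here (a nearest-centroid classifier) and the class name `NearestCentroidModel` are assumptions chosen only because they are short enough to read in one sitting.

```python
import math


class NearestCentroidModel:
    """From-scratch classifier: predict the class whose mean point is closest."""

    def fit(self, X, Y):
        # Group the training points by their label.
        groups = {}
        for x, y in zip(X, Y):
            groups.setdefault(y, []).append(x)
        # The "learning" step: one centroid (per-feature mean) per class.
        self.centroids = {
            label: [sum(col) / len(points) for col in zip(*points)]
            for label, points in groups.items()
        }
        return self

    def predict(self, X):
        # Assign each point to the class with the nearest centroid.
        return [
            min(self.centroids, key=lambda label: math.dist(x, self.centroids[label]))
            for x in X
        ]


# Hypothetical toy data: two well-separated groups of 2-D points.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
Y = [0, 0, 1, 1]
model = NearestCentroidModel()
model.fit(X, Y)
predictions = model.predict([[0.5, 0.5], [5.5, 4.0]])
print(predictions)
```

The last three lines look just like the API calls above. The difference is that you can now point to the exact lines where the model learns and where it decides.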

Example 2: Typical College / University Course

If I only had a choice between #1 and #2, I would usually choose #2. Despite the problems listed below, a college course on ML is far superior to the typical “for dummies” Udemy course.

The typical college course proceeds as follows:

Show all the math and pseudocode (usually), but go through them at super speed! Usually, each slide contains 3-5 equations (way too much packed onto one slide).


Luckily, college ML courses do not use Scikit-Learn (unlike typical Udemy courses). Students do get to practice implementing algorithms (good).

Unfortunately, because the class moves so quickly, they will not get to implement everything. Usually, each lecture will cover 1 or 2 algorithms (suppose there are 12 lectures per semester, so you will learn ~12-24 different algorithms in the whole course, which is too many too fast in my opinion).

But, since you can’t have 20 homework assignments (don’t get any crazy ideas professors!), you will likely only get to implement a few (e.g. logistic regression, k-means clustering, neural networks).

The end result: the course goes so fast that students end up learning very little. They have some idea how each algorithm works, but not in-depth (even though all the equations were shown, it was simply too fast).

Another problem: Because implementation is only a homework assignment, you never get to see a reference implementation. As you probably know, not everyone is an A+ student. Why?

Well, you lose marks when you get things wrong. If a student gets a B or a C on these assignments, it means they got some things wrong. They still passed the course, but they never learned to code the algorithm correctly.

So even if you “pass the course”, you are still missing knowledge!

Example 3: My Courses

In my courses, I aim to fix these shortcomings of both the typical Udemy course and the typical academic course.

How? By going in-depth into each subject.


  • Don’t cover every machine learning algorithm under the sun. Focus on one or a few algorithms per course.
  • For each algorithm, derive any equations used from first principles (typically, that means calculus, linear algebra, and probability, unless the course has other prerequisites).
  • For each algorithm, derive the “algorithmic” part / pseudocode from first principles.
  • For each algorithm, implement it in code from scratch. This requires true understanding. It forces you to understand the equations and not skip over the details. It forces you to think about how the algorithm works, rather than just using flawed “intuition”.
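
As a small illustration of the derive-then-implement workflow, here is a sketch for 1-D logistic regression (one of the algorithms mentioned earlier). The derivation it relies on: with p = sigmoid(w*x + b) and the cross-entropy loss, the gradient with respect to w is the sum of (p - y) * x, and the gradient with respect to b is the sum of (p - y). The function names, learning rate, epoch count, and toy data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Gradient descent on the mean cross-entropy loss for p = sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # Derived gradient: dJ/dw = sum (p - y) * x, dJ/db = sum (p - y)
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: negative inputs are class 0, positive inputs are class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in xs]
print(preds)
```

If the derivation were wrong (say, a flipped sign), the code would move the weights in the wrong direction and the predictions would show it. That is the sense in which implementing forces you to understand the equations.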

Advantage #1: Anyone should be able to fully understand the material as long as they meet the prerequisites (usually basic calculus, linear algebra, probability, and programming). This is because everything is built up logically from these basic first principles.

Advantage #2: Because everything is derived from first principles, it’s unlike a typical academic course, where too many equations are shown all at once, and it’s not clear how the flow of logic proceeds unless you go through them slowly by yourself. Instead, each equation is shown one-by-one.

If you don’t understand it, then simply pause the video, and review it again, or use the Q&A to clarify your misunderstanding.

In a college class, things go by so fast you rarely have the opportunity to even think of the right question to ask before the lecturer moves on to the next subject.

Advantage #3: You get to implement everything from scratch. This means you have true understanding. Unlike the typical academic course, where implementation is just homework, I am going to show you a reference example of the code. So even if you get it wrong, you still have a chance to fix your mistakes.

So what does “in-depth” mean?

It means:

  1. Deriving all the math (unlike Udemy courses)
  2. Deriving all the algorithms and pseudocode (unlike Udemy courses)
  3. Showing the implementation of the code (unlike academic courses)

Common Beginner Mistake

Many beginners confuse the word “depth” with “breadth”.

For example, a beginner may come across a 20-40h course and proclaim, “This course covers so many topics! It is very in-depth!”

In actuality, it’s the opposite of deep. It’s shallow.

Spending 10 hours on basic Python, 1 hour on linear regression, 1 hour on logistic regression, etc. etc. means you only learned very basic stuff.

You didn’t learn anything “deep”. You only have superficial understanding.

Beginners often think that to be “in-depth”, a course must cover many different algorithms. That is the opposite of depth. That is called breadth.

Depth means going over the details of an algorithm. Specifically:

  • Deriving each equation being used and the intuition behind them.
  • Deriving any algorithms that must be put into code.
  • Walking through the pseudocode.
  • Implementing the actual code (pseudocode is nice, but it can’t run on your computer).
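
Here is a sketch of the last two steps, going from pseudocode to running code, using k-means clustering as the example. The pseudocode appears as comments at the top of the function. Starting from the first k points as the initial centers is an assumption made to keep the example deterministic, and the function name and data are made up.

```python
import math


def kmeans(points, k, iterations=10):
    # Pseudocode:
    #   initialize k centers
    #   repeat:
    #       assign each point to its nearest center
    #       move each center to the mean of its assigned points
    centers = [list(p) for p in points[:k]]  # assumption: first k points as initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, clusters


# Hypothetical toy data: one group near (0, 0) and one near (10, 10).
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(points, k=2)
print(centers)
```

Notice the detail the pseudocode never mentions: what to do when a cluster is empty. Questions like that only come up once you write the actual code.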

This is called “depth” because it goes over every nook and cranny of an algorithm or model, leaving no stone unturned.

“Breadth” is the opposite. This means covering many different topics in a shallow fashion (the word “shallow” is the opposite of “deep”).

Typically that involves what I described in Example 1 (very brief explanation, a few lines of code to call an API).

Breadth is useless. I can obtain breadth for free by reading the Scikit-Learn documentation.

The complete PyTorch course for AI and Deep Learning has arrived

April 1, 2020

PyTorch: Deep Learning and Artificial Intelligence

VIP Promotion

The complete PyTorch course has arrived

Hello friends!

I hope you are all staying safe. Well, I’m sure you’ve heard enough about that so how about some different news?

Today, I am announcing the VIP version of my latest course: PyTorch: Deep Learning and Artificial Intelligence

[If you don’t want to read my little spiel just click here to get your VIP coupon: https://www.udemy.com/course/pytorch-deep-learning/?couponCode=PYTORCHVIP]

https://www.udemy.com/course/pytorch-deep-learning/?couponCode=PYTORCHVIP25 (expires May 25, 2022)

This is a MASSIVE (over 24 hours) Deep Learning course covering EVERYTHING from scratch. That includes:

  • Machine learning basics (linear neurons)
  • ANNs, CNNs, and RNNs for images and sequence data
  • Time series forecasting and stock predictions (+ why all those fake data scientists are doing it wrong)
  • NLP (natural language processing)
  • Recommender systems
  • Transfer learning for computer vision
  • GANs (generative adversarial networks)
  • Deep reinforcement learning and applying it by building a stock trading bot

IN ADDITION, you will get some unique and never-before-seen VIP projects:

Estimating prediction uncertainty

This means drawing the standard deviation of the prediction along with the prediction itself. This is useful for heteroskedastic data (meaning the variance changes as a function of the input). The most popular application where heteroskedasticity appears is stock prices and stock returns, which I know a lot of you are interested in.

It allows you to draw your model predictions together with their standard deviation.

Sometimes, the data is simply such that a spot-on prediction can’t be made. But we can do better by letting the model tell us how certain it is in its predictions.
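
To see why an input-dependent standard deviation helps, here is a small sketch on synthetic data. It is not the method from the course, which this post does not describe. It only compares two assumptions about the noise using the Gaussian negative log-likelihood (lower is better). The data-generating rule (std = 0.1 + x), the known mean line y = 2x, and all names are assumptions made for the example.

```python
import math
import random


def gaussian_nll(y, mean, std):
    """Negative log-likelihood of y under Normal(mean, std**2)."""
    return 0.5 * math.log(2 * math.pi * std ** 2) + (y - mean) ** 2 / (2 * std ** 2)


random.seed(0)
# Synthetic heteroskedastic data: the noise std grows with x (std = 0.1 + x).
xs = [i / 100 for i in range(1, 301)]
ys = [2 * x + random.gauss(0, 0.1 + x) for x in xs]

# Assumption A: one constant std for every input (the RMS residual around y = 2x).
const_std = math.sqrt(sum((y - 2 * x) ** 2 for x, y in zip(xs, ys)) / len(xs))
nll_const = sum(gaussian_nll(y, 2 * x, const_std) for x, y in zip(xs, ys)) / len(xs)

# Assumption B: the std is a function of the input, matching how the data was made.
nll_hetero = sum(gaussian_nll(y, 2 * x, 0.1 + x) for x, y in zip(xs, ys)) / len(xs)

print(nll_const, nll_hetero)
```

Both assumptions use the same mean prediction, so neither is more “spot-on” than the other. The second one scores better only because it is honest about where the prediction is reliable and where it is not.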

Facial recognition with siamese networks

This one is cool. I mean, I don’t have to tell you how big facial recognition has become, right? It’s the single most controversial technology to come out of deep learning. In the past, we looked at simple ways of doing this with classification, but in this section I will teach you about an architecture built specifically for facial recognition.

You will learn how this can work even on small datasets – so you can build a network that recognizes your friends or can even identify all of your coworkers!
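
The core idea of a siamese network can be sketched in a few lines: the same weights embed both inputs, and a loss is computed on the distance between the two embeddings. The example below uses the contrastive loss, which is one common choice. The post does not say which loss the course uses, and the weights, the 3-number “images”, and all names here are hypothetical.

```python
import math


def embed(x, weights):
    # The shared "network": one linear layer, applied identically to every input.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def contrastive_loss(x1, x2, same_person, weights, margin=1.0):
    # Distance between the two embeddings produced by the SAME weights.
    d = math.dist(embed(x1, weights), embed(x2, weights))
    if same_person:
        return d ** 2  # matching pair: pull the embeddings together
    return max(0.0, margin - d) ** 2  # different pair: push at least `margin` apart


# Hypothetical fixed (untrained) weights and tiny stand-ins for face images.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
anna_1 = [0.0, 0.0, 5.0]
anna_2 = [0.1, 0.0, -3.0]
bob_1 = [2.0, 2.0, 5.0]

loss_same = contrastive_loss(anna_1, anna_2, True, weights)
loss_diff = contrastive_loss(anna_1, bob_1, False, weights)
print(loss_same, loss_diff)
```

Because training works on pairs rather than single images, n photos yield n(n-1)/2 possible pairs, which is one plausible reason this approach can cope with small datasets.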

You can really impress your boss with this one. Surprise them one day with an app that calls out your coworkers by name every time they walk by your desk. 😉

Please note: The VIP coupon will work only for the next month (ending May 1, 2020). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access on deeplearningcourses.com.

Minimal Prerequisites

This course is designed to be a beginner to advanced course. All that is required is that you take my free Numpy prerequisites to learn some basic scientific programming in Python. And it’s free, so why wouldn’t you!?

You will learn things that took me years to learn on my own. For many people, that is worth tens of thousands of dollars by itself.

There is no heavy math, no deriving backpropagation by hand, etc. Why? Because I already have courses on those things, so there’s no need to repeat them here. PyTorch computes the gradients for you automatically, so you never have to write them out yourself. So you can relax and have fun. =)

Why PyTorch?

All of my deep learning courses until now have been in Tensorflow (and prior to that Theano).

So why learn PyTorch?

Does this mean my future deep learning courses will use PyTorch?

In fact, if you have traveled in machine learning circles recently, you will have noticed that there has been a strong shift to PyTorch.

Case in point: OpenAI switched to PyTorch earlier this year (2020).

Major AI shops such as Apple, JPMorgan Chase, and Qualcomm have adopted PyTorch.

PyTorch is primarily maintained by Facebook (Facebook AI Research to be specific), the “other” Internet giant that, alongside Google, has a strong vested interest in developing state-of-the-art AI.

But why PyTorch for you and me? (aside from the fact that you might want to work for one of the above companies)

As you know, Tensorflow has adopted the super simple Keras API. This makes common things easy, but it makes uncommon things hard.

With PyTorch, common things take a tiny bit of extra effort, but the upside is that uncommon things are still very easy.

Creating your own custom models and inventing your own ideas is seamless. We will see many examples of that in this course.

For this reason, it is very possible that future deep learning courses will use PyTorch, especially for those advanced topics that many of you have been asking for.

Because of the ease with which you can do advanced things, PyTorch is the main library used by deep learning researchers around the world. If that’s your goal, then PyTorch is for you.

In terms of growth rate, PyTorch dominates Tensorflow. Papers using PyTorch now outnumber papers using Tensorflow by 2:1 and even 3:1 at major machine learning conferences. Many researchers hold that PyTorch is superior to Tensorflow in terms of the simplicity of its API, and even speed / performance!

Do you need more convincing?
