The complete PyTorch course for AI and Deep Learning has arrived

April 1, 2020

PyTorch: Deep Learning and Artificial Intelligence

VIP Promotion

Hello friends!

I hope you are all staying safe. Well, I’m sure you’ve heard enough about that so how about some different news?

Today, I am announcing the VIP version of my latest course: PyTorch: Deep Learning and Artificial Intelligence

[If you don’t want to read my little spiel just click here to get your VIP coupon:]

[The NEW VIP coupon for May 2 – June 2 2020 is:]

[The NEW VIP coupon for June 2 – July 3 2020 is:]

[The NEW VIP coupon for July 6 – August 6 2020 is:]

[The NEW VIP coupon for August 7 – September 7 2020 is:]

This is a MASSIVE (over 22 hours) Deep Learning course covering EVERYTHING from scratch. That includes:

  • Machine learning basics (linear neurons)
  • ANNs, CNNs, and RNNs for images and sequence data
  • Time series forecasting and stock predictions (+ why all those fake data scientists are doing it wrong)
  • NLP (natural language processing)
  • Recommender systems
  • Transfer learning for computer vision
  • GANs (generative adversarial networks)
  • Deep reinforcement learning and applying it by building a stock trading bot

IN ADDITION, you will get some unique and never-before-seen VIP projects:


Estimating prediction uncertainty

Drawing the standard deviation of the prediction along with the prediction itself. This is useful for heteroskedastic data (that means the variance changes as a function of the input). The most popular application where heteroskedasticity appears is stock prices and stock returns – which I know a lot of you are interested in.

It allows you to draw your model predictions like this:

Sometimes, the data is simply such that a spot-on prediction can’t be made. But we can do better by letting the model tell us how certain it is in its predictions.
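As a rough illustration of the idea (this is not the course's actual code), one common way to let a model report its own uncertainty is to have it predict both a mean and a variance, and to score it with the Gaussian negative log-likelihood. The function name and the numbers below are made up for this sketch; in PyTorch the same quantity would typically be computed on tensors.

```python
import math

def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under a Gaussian N(mean, exp(log_var)).

    Predicting the log-variance (an assumption of this sketch) keeps the
    variance positive without constraining the model output.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2 * math.pi) + log_var + (y - mean) ** 2 / var)

# The same miss of 3 units, scored under two different claimed uncertainties.
confident = gaussian_nll(y=3.0, mean=0.0, log_var=0.0)            # std = 1
uncertain = gaussian_nll(y=3.0, mean=0.0, log_var=math.log(9.0))  # std = 3

print(confident, uncertain)
```

A confidently wrong prediction is penalized more than an honestly uncertain one, which is what pushes the predicted standard deviation to widen where the data is noisy.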


Facial recognition with siamese networks

This one is cool. I mean, I don’t have to tell you how big facial recognition has become, right? It’s the single most controversial technology to come out of deep learning. In the past, we looked at simple ways of doing this with classification, but in this section I will teach you about an architecture built specifically for facial recognition.

You will learn how this can work even on small datasets – so you can build a network that recognizes your friends or can even identify all of your coworkers!

You can really impress your boss with this one. Surprise them one day with an app that calls out your coworkers by name every time they walk by your desk. 😉


Please note: The VIP coupon will work only for the next month (ending May 1, 2020). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access on


Minimal Prerequisites

This course is designed to be a beginner to advanced course. All that is required is that you take my free Numpy prerequisites to learn some basic scientific programming in Python. And it’s free, so why wouldn’t you!?

You will learn things that took me years to learn on my own. For many people, that is worth tens of thousands of dollars by itself.

There is no heavy math, no backpropagation, etc. Why? Because I already have courses on those things. So there’s no need to repeat them here, and PyTorch doesn’t make you do them by hand: its automatic differentiation computes the gradients for you. So you can relax and have fun. =)


Why PyTorch?

All of my deep learning courses until now have been in Tensorflow (and prior to that Theano).

So why learn PyTorch?

Does this mean my future deep learning courses will use PyTorch?

In fact, if you have traveled in machine learning circles recently, you will have noticed that there has been a strong shift to PyTorch.

Case in point: OpenAI switched to PyTorch earlier this year (2020).

Major AI shops such as Apple, JPMorgan Chase, and Qualcomm have adopted PyTorch.

PyTorch is primarily maintained by Facebook (Facebook AI Research to be specific) – the “other” Internet giant which, alongside Google, has a strong vested interest in developing state-of-the-art AI.

But why PyTorch for you and me? (aside from the fact that you might want to work for one of the above companies)

As you know, Tensorflow has adopted the super simple Keras API. This makes common things easy, but it makes uncommon things hard.

With PyTorch, common things take a tiny bit of extra effort, but the upside is that uncommon things are still very easy.

Creating your own custom models and inventing your own ideas is seamless. We will see many examples of that in this course.

For this reason, it is very possible that future deep learning courses will use PyTorch, especially for those advanced topics that many of you have been asking for.

Because of the ease with which you can do advanced things, PyTorch is the main library used by deep learning researchers around the world. If that’s your goal, then PyTorch is for you.

In terms of growth rate, PyTorch dominates Tensorflow. Papers using PyTorch now outnumber those using Tensorflow by 2:1 and even 3:1 at major machine learning conferences. Many researchers hold that PyTorch is superior to Tensorflow in terms of the simplicity of its API, and even speed / performance!

Do you need more convincing?

New Exclusive Course: Linear Programming for Linear Regression in Python

July 14, 2020

If you’ve been to recently, you will have noticed that there is now a section for exclusive courses. These are courses that will *not* be on any other platforms, only

These are what I’ve been calling “mini-courses” during their development and that’s what they are in spirit. They are:

  • Lower cost
  • Shorter in duration

There won’t be any time spent on stuff like appendices which most of you have already seen and are mainly for beginners.

The point of these courses is to have a faster turn-around time on course development. Sometimes, there are topics I want to cover really quickly that won’t ever become a full-sized course. They will also be used to cover more advanced topics.

Unfortunately, a lot of students on other platforms (e.g. Udemy) are complete beginners who have no desire to advance and gain actual skill. They take “marketer-taught” courses, which leads to a complex I call “confidence without ability”. Dealing with such students is draining.

These mini-courses will bring us back to the old days (many of you have been around since then!) where the material was more concise, straight-to-the-point, and didn’t need “beginner reminders” all over the place.

Given that these mini-courses are much simpler for me to make, I expect there to be many more in the future.

This first exclusive mini-course is on Linear Programming for Linear Regression.

Many students in my Linear Regression course often ask, “What if I want to use absolute error instead of squared error?” This course answers exactly that question and more.

The solution is based on Linear Programming (LP).

We will also cover 2 other common problems: maximum absolute deviation and positive-only (or negative-only) error.

These kinds of problems are often found in professional fields such as quantitative finance, operations research, and engineering.

Each of these problems can be solved using Linear Programming with the Scipy library.
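To give a flavor of what this looks like, here is a minimal sketch of absolute-error regression written as a linear program with scipy.optimize.linprog. It is not the course's code: the function name lad_fit, the variable layout [w, b, t_1..t_n], and the toy data are all assumptions made for illustration. The trick is to introduce one bound t_i per data point with |y_i - (w*x_i + b)| <= t_i and minimize the sum of the t_i.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(x, y):
    """Fit y ~ w*x + b by minimizing the sum of absolute errors via LP."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Decision variables: [w, b, t_1, ..., t_n]; only the t_i are in the cost.
    c = np.concatenate([np.zeros(2), np.ones(n)])

    X = np.column_stack([x, np.ones(n)])
    I = np.eye(n)
    # (w*x_i + b) - y_i <= t_i   and   y_i - (w*x_i + b) <= t_i
    A_ub = np.vstack([np.hstack([X, -I]), np.hstack([-X, -I])])
    b_ub = np.concatenate([y, -y])

    # linprog defaults to non-negative variables, so w and b must be freed.
    bounds = [(None, None)] * 2 + [(0, None)] * n

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[0], res.x[1]

# Toy data on the line y = 2x + 1, with one outlier at x = 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 25.0, 7.0, 9.0]
w, b = lad_fit(x, y)
print(w, b)
```

On this toy data the absolute-error fit ignores the outlier and recovers the underlying line, which squared error would not do. The maximum-absolute-deviation variant follows the same pattern, with a single shared bound t in place of the per-point t_i.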

BONUS FACT: I have a new pen and tablet set up so most of the derivations in this course are done by hand – really truly old-school like the Linear/Logistic Regression days!


MATLAB for Students, Engineers, and Professionals in STEM

Another exclusive course which has already been on for some time is my original MATLAB course. This was the first course I ever made and is basically a collector’s item. The quality isn’t that great compared to what I am creating now, but obviously you will still learn a lot.

I’m including it in this newsletter to announce that I was able to dig up an extra section on probability that didn’t exist before. So the course now has 3 major sections:

  1. MATLAB basic operations and variables
  2. Signal processing with sound and images
  3. Probability and statistics

Beginners: How to get an infinite amount of practice and exercise in machine learning

July 9, 2020

One of the most common questions I get from beginners in machine learning is, “how do I practice what I’ve learned?”

There are several ways to answer this.

First, let’s make an important distinction.

There’s a difference between putting in the work to understand an algorithm, and using that algorithm on data. We’ll call these the “learning phase” and the “application phase”.


Learning phase = Putting in the work to understand an algorithm
Application phase = Using that algorithm on data


Let’s take a simple example: linear regression.

In the learning phase, your tasks will include:

  • Being able to derive the algorithm from first principles (that’s calculus, linear algebra, and probability)
  • Implementing the algorithm in the language of your choice (it need not be Python)
  • Testing your algorithm on data to verify that it works

These are essential tasks in ensuring that you really understand an algorithm.
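As a small sketch of the last two tasks, here is one-dimensional linear regression implemented from its closed-form solution and then checked on data where the right answer is known in advance. The function name and the toy data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Test on noiseless data generated from a known line: y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

If the implementation cannot recover a line it was handed directly, something in the derivation or the code is wrong, and that is exactly the kind of feedback this phase is for.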

These tasks are “exercises” that improve your general aptitude in machine learning, and will strengthen your ability to learn other algorithms in the future, such as logistic regression, neural networks, etc.

As my famous motto goes: “if you can’t implement it, then you don’t understand it”.

Interestingly, 5 years after I invented this motto, I discovered that the famous physicist Richard Feynman said a very similar thing!


In order to get an infinite amount of practice in this area, you should learn about various extensions on this algorithm, such as L1 and L2 regularization, using gradient descent instead of the closed-form solution, 2nd order methods, etc.
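For instance, two of those extensions fit in a few lines: gradient descent in place of the closed-form solution, plus an optional L2 penalty on the slope. This is only a sketch; the function name, learning rate, step count, and toy data are assumptions chosen so the example converges.

```python
def fit_line_gd(xs, ys, l2=0.0, lr=0.05, steps=5000):
    """Minimize mean squared error + l2 * w**2 by gradient descent."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the penalized mean squared error.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n + 2.0 * l2 * w
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noiseless toy data from the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w_plain, b_plain = fit_line_gd(xs, ys)          # no regularization
w_ridge, b_ridge = fit_line_gd(xs, ys, l2=1.0)  # slope is shrunk toward zero
print(w_plain, b_plain, w_ridge, b_ridge)
```

Without the penalty, gradient descent lands on the same answer as the closed-form solution; with it, the slope shrinks, which you can verify by hand from the penalized objective.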

You might want to try implementing it in a different language. And finally, you can spend a lifetime exercising your ability to understand machine learning algorithms by learning about more machine learning algorithms in much the same way.

Believe me, 10 years down the line you may discover something new and interesting about even the simplest models like Linear Regression.


The second phase is the “application phase”.

Here is where your ability to exercise and practice is really infinite.

Let’s first remember that I don’t know you personally. I don’t know what you care about, what field you are in, or what your motivations for learning this subject are.

Therefore, I cannot tell you where to apply what you’ve learned: only you know that.

For example, if you are a computational biologist, then you can use this algorithm on problems specific to computational biology.

If you are a financial engineer, then you can use this algorithm on problems specific to financial engineering.

Of course, because I am not a computational biologist, I don’t know what that data looks like, what the relevant features are, etc.

I can’t help you with that.

The “interface” where I end and you begin is the algorithm.

After I teach you how and why the algorithm works and how to implement it in code, using it to further scientific knowledge in your own field of study becomes your responsibility.

One can’t expect me to be an expert computational biologist and an expert financial engineer and whatever else it is that you are an expert in.

Therefore, you can’t rely on me to tell you what datasets you might be interested in, what kinds of problems you’re trying to solve, etc.

Presumably, since you’re the expert, you should know that yourself!

If you don’t, then you are probably not the expert you think you are.

But therein lies the key.

Once you’ve decided what you care about, you can start applying what you’ve learned to those datasets.

This will give you an infinite amount of practice, assuming you don’t run out of things to care about.

If you don’t care about anything, well then, why are you doing this in the first place? Lol.

This also ties nicely into another motto of mine: “all data is the same”.

What does this mean?

Let’s recall the basic “pattern” we see when we use scikit-learn:

model = LinearRegression()
model.fit(X_train, Y_train)

Does this code change whether (X, Y) represent a biology dataset or a finance dataset?

The answer is no!

Otherwise, no such library as Scikit-Learn could even exist!

“All data is the same” means that the same Linear Regression algorithm applies, no matter what field or what industry your data happens to come from.

There’s no such thing as “Linear Regression for biology” and “Linear Regression for finance”.

There’s only one linear regression that is the same linear regression no matter the dataset.

Thus, you learn the algorithm once, and you can apply it infinitely to any number of datasets!
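The point can be made concrete with a small sketch. The helper fit_and_predict and both datasets are made up for illustration (the “biology” and “finance” labels are only labels); what matters is that the scikit-learn calls are identical in both cases.

```python
from sklearn.linear_model import LinearRegression

def fit_and_predict(X_train, Y_train, X_test):
    """The same three steps, whatever the data happens to represent."""
    model = LinearRegression()
    model.fit(X_train, Y_train)
    return model.predict(X_test)

# Made-up "biology" data: one feature, Y = 2 * x
bio_pred = fit_and_predict([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0], [[4.0]])

# Made-up "finance" data: two features, Y = 3 * x0 - x1 + 5
fin_pred = fit_and_predict(
    [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]],
    [4.0, 8.0, 7.0, 10.0],
    [[2.0, 2.0]],
)

print(bio_pred, fin_pred)
```

Nothing inside fit_and_predict knows or cares which field the numbers came from.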

Pretty cool huh?

But look, if you really have zero idea of what you care about, or your answer is “I care about machine learning”, then there are plenty of stock datasets that you can look up on your own.

These include Kaggle, the UCI repository, etc. There’s so much data out there, you will still have to pick and choose what to focus on first.

Again, you have to choose what you care about. Nobody else can possibly tell you that with any accuracy.



The “learning phase” above does not apply to situations where you’re learning an API (for example, Tensorflow 2, PyTorch, or even Scikit-Learn).


Well firstly, there’s nothing really to derive.

Secondly, it would be impossible for you to implement anything yourself without me showing you how first (at which point anything you type would amount to simply copying what I did).


Well, how would you know what to type if I didn’t show you?

Are you going to magically come up with the correct syntax for a library that you simply haven’t learned?

Obviously not. That would be a ludicrous idea.

In this case, the “learning phase” amounts to:

  • Understanding the syntax I’ve shown you
  • Being able to replicate that syntax on your own

This most closely represents what you will do in the “real-world”. In the real-world, you want to be able to write code fast and efficiently, rather than trying to remember which course and which lecture covered that exact syntax you’re thinking of.

Being able to write code on-the-fly makes you efficient and fast. Obviously, I don’t remember everything, but I know where to look when I need something. You have to find the right balance between memorizing and looking up, just like how computer programs use caches and RAM for fast data retrieval instead of always going to the hard drive.

Obviously, this will get better over time as you practice more and more.

It sounds overly simplistic, but it’s nothing more than repetition and muscle memory. I usually don’t explicitly commit to memorizing anything. I just write code and let it come naturally. The key is: I write code.

Obviously, watching me play tennis or reading books about tennis will not make you a better tennis player.

You must get on the court, pick up the tennis racket, and play actual tennis matches!

I can’t force you to do this. It’s the kind of thing which must be done of your own volition.

I mean, if you want to pay me consulting hours to call you and remind you to practice, I’d be very happy to oblige. =)

At this point, once you have completed the “learning phase” of learning an API, then the “application phase” described above still applies.

Data Science Interview Questions: Why “Logits” in Deep Learning Cross-Entropy Loss?

March 27, 2020

In this Data Science Interview Questions series, we’re going to answer the question:

Why do deep learning libraries have functions like “softmax_cross_entropy_with_logits_v2”?

Why can’t we just use the formulas we learned in class?

What do these functions do and how?
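One commonly cited reason is numerical stability. The sketch below is not taken from the video, and the function names are hypothetical. It compares the textbook formula (softmax first, then log) with a version that works directly on the logits using the log-sum-exp trick.

```python
import math

def naive_cross_entropy(logits, label):
    """Textbook formula: softmax, then negative log of the true class."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[label] / sum(exps))

def cross_entropy_from_logits(logits, label):
    """Same quantity, computed as log-sum-exp(logits) - logits[label]."""
    m = max(logits)  # subtracting the max keeps every exponent <= 0
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

small = [2.0, 1.0, 0.1]
large = [1000.0, 0.0]

# On moderate logits the two versions agree.
print(naive_cross_entropy(small, 0), cross_entropy_from_logits(small, 0))

# On large logits the textbook formula overflows; the logits version does not.
try:
    naive_cross_entropy(large, 0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

print(naive_overflowed, cross_entropy_from_logits(large, 0))
```

Mathematically the two functions are identical; they only differ in what happens once the numbers get large enough to break floating point.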

Click to watch the video below:

Machine Learning: College Student vs. Industry Professional? Academic Study vs. Business Impact?

March 10, 2020

One of the most common complaints I hear from students is: Why do I have to learn all this math? Why isn’t there a library to do what I want?

Someone recently made this proclamation to me: “You should explain that your courses are for college students, not industry professionals”.

This made me laugh very hard.

In this article, I will refer to students who make such proclamations as “ML wannabes” for lack of a better term, because people who actually do ML generally know better than this.


Choosing Between Academic and Professional is a False Dichotomy

Yes, even Geoffrey Hinton, Yann LeCun, and Yoshua Bengio have to choose between “academia” and “industry”.

But are they choosing between using Tensorflow vs. exploring the fundamental ideas in machine learning (which necessarily involves lots of theoretical thinking and math)?


I think it’s clear that Geoffrey Hinton isn’t sitting there and saying, “screw all this math, let me just plug my data into Keras”.


Ok, but you and I are not Geoffrey Hinton. So what about us?

When ML wannabes say that my courses are for college students and not professionals, my immediate thought is: What kind of so-called “professionals” do you work with?

Are they fake professionals?

What do you think college students do after they graduate college?

I hope these questions aren’t too philosophical… it’s a pretty standard path: college students graduate college, then work as professionals.

Ergo, professionals are former college students. They have all the knowledge of a college student, and then some.

So isn’t it the case then that being a professional means that they are now experts at all this “math stuff”?

By that logic, shouldn’t it be the case that professionals are the best-equipped to learn all this “math stuff”?



You are not choosing between having an understanding of the math behind machine learning and its practical application.

Being effective at applying machine learning practically involves having a base level of theoretical understanding.

Conversely, having a good understanding of machine learning in theory involves a base level of understanding of how it will be applied in the real world.

i.e. It’s “AND”, not “OR”. You don’t get to choose between these. If you miss one, you’ll be bad at the other. You need both.

Notice that I said a base level of understanding: you don’t have to do a PhD in statistical learning theory. In fact, I hate statistical learning theory.

Nothing I teach involves PhD-level math, so if you think that’s what it is, then you are overestimating everything, including your own skill. I always find it funny when students say “you need your PhD to do this math”. Actually, saying stuff like that just makes YOU look silly because you think you’re closer to it than you really are. In fact, it’s not “PhD math” you’re having trouble with, it’s just undergraduate math…

Even worse: the people who tell me they have a PhD and that’s why they know what they’re talking about. This is the funniest. You have a PhD and you admit that you still have trouble with undergraduate math? Isn’t that just making you look bad? Very silly indeed.


Professionals forgot what they learned in college

You may be excused if it’s been 20 years since you graduated college and you’re not doing math or algorithms every day.

Ok, but then what are you doing? Why did they hire you in the first place?

Again, saying all this stuff just makes YOU look bad.

All that stuff you learned in college went out the window?

You paid thousands of dollars and learned nothing useful whatsoever?

Well maybe you got a job that only requires doing database queries and building report dashboards all day.

You got comfortable.

No problem with that. It pays well. It’s steady. You don’t have to bang your head against the wall all the time.

But now what? What are you going to do about it?

Are you going to blame other people for your situation?

If I want to go back to being a professional tennis player after a few years off the courts due to injury, is it my job to train myself back to a world-class level, or is it my opponents’ job to go easy on me?


Coding Interviews

Coding interviews at Google, Facebook, Amazon, etc. are great examples of why, as a professional, you can’t simply forget everything you learned in college.

Not knowing “college material” doesn’t make you “not a college student”, it makes you a bad professional.

i.e. You’re supposed to know this stuff, and yet you don’t.

You can’t say “I’m not a college student”, because that doesn’t excuse you from not knowing this stuff anyway.

What about people who didn’t go to college? There are tons of stories out there about people who have worked their way up from scratch on their own. They learned what they had to in order to pass the coding interview.

They did not say, “since I am not a college student, the company must change their standards when they hire me”.

They did all the same work as a college student, and in fact, it is even more admirable that they taught themselves!

So as an ML Wannabe, realize that saying “I’m not a college student” is not an excuse, it’s merely equivalent to saying, “I’m not a good professional, I’m a professional that lacks knowledge”.


There is a funny contradiction when it comes to coding interviews:

The wannabes always say, “industry doesn’t care about academics”.

Then when it comes to these coding interviews that they have trouble with, they complain: “industry cares too much about academics!”

So which is it?

That is why professionals at these tech companies do such great work: they are not just professionals. They are professionals who apply what they learned in college on a daily basis.

More on that next.


How Google Came To Be

With hindsight, we can look with great awe at how Google became the giant it is today.

It all started with a simple Markov model (the kind of math ML Wannabes try to avoid).

This Markov model was a model of links to webpages on the Internet, and this allowed the founders of Google to create the most powerful search engine the world had ever seen.

Of course there were engineering challenges as well. How do you find the eigenvalues of a matrix with over a million rows and columns?
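To make the idea less abstract, here is a toy sketch of a Markov model over links, in the spirit of the published PageRank idea. It is in no way Google's implementation: the three-page graph, the damping value of 0.85, and the function name are assumptions for illustration, and every page is assumed to have at least one outgoing link. The ranking corresponds to the dominant eigenvector of the link matrix, which repeated multiplication (power iteration) approximates without ever factoring the matrix.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a random-surfer Markov chain over `links`.

    `links` maps each page to the list of pages it links to.
    Assumes every page has at least one outgoing link.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        # Otherwise the surfer follows one of the current page's links.
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
rank = pagerank(links)
print(rank)
```

Each pass only needs the list of links, not the full matrix, which hints at why the real problem turned into one of distributing this computation across many machines.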

“Aha!” You say. The Lazy Programmer is wrong. Clearly this is a practical engineering problem and not an academic one!

Sorry to say, but you are still wrong.

What library do you use to factor a million x million matrix? Oh right, one doesn’t exist.

Google didn’t just pioneer the mathematics of search, they also pioneered the field of big data which is a subset of distributed computing.

Why didn’t they just use the MapReduce library? Oh right, because they had to invent it first!

If you don’t believe that this is an “academic” subject, you can read the many papers Google has put out on its file system, global databases, etc.

I talk a lot about math in this article but another thing ML Wannabes really hate is programming things on their own. (They prefer to use libraries that involve just a few lines of code to get the job done quickly.)

So what happens when the library you want doesn’t exist?

Do you say screw it and move on to something else?

Well that’s what differentiates the leaders and the followers.

And surely one must ask themselves: is creating a billion dollar company practical?

Whether you agree with these hiring practices or not, one can’t dispute results.


You Live in a Bubble

You might say: “Everyone around me is a professional, and they would all disagree with YOU, Lazy Programmer!”

You live in a bubble. Everyone does.

It makes sense if you think about it.

Your company hired people of similar aptitude to be on your team to get a particular job done. You are surrounded by like-minded people.

Of course you are.

And if anyone disagreed with you, you probably wouldn’t be friends with them anyway.

The likelihood of you being surrounded by opposing viewpoints is small.

How would your team get anything done if you could never agree?

But you can’t make the assumption that whatever is in your immediate radius applies uniformly throughout the rest of the world.

There are tens of thousands of STEM undergraduates going into the workforce each year.

Do you think they just automatically forget their undergraduate training?

Are the past 4 years simply erased from their minds?

No – instead, they become these coveted professional-college student hybrids you so fear.



Choose Your Own Life; Choose Your Own Career

I get it.

At some point, you want to stop thinking so hard.

You want to have a family.

You want to start taking up other hobbies that do not involve being a geek.

I can’t say that won’t be me someday.

In that case, go for that cushy job where it’s a little easier and you get to use all the libraries you want and never have to think about calculus and graph algorithms.

There’s nothing wrong with that.

A comfortable life, a comfortable software developer salary…

This is an excellent goal to have in life.

But you can’t have your cake and eat it too.

If you want to be a real professional (and not a wannabe) then you have to put in the work.

I’m just a vessel for information. I take machine learning and bring it to you.

I did not invent machine learning. So if there’s math, there’s math because the guy who invented it used that math.

Don’t blame me.

If you want to do machine learning, then accept what machine learning is.

You can’t choose to do machine learning, and then refuse to do all the work that everyone else did.

What makes you feel so self-entitled that you think everyone else has to do it except you?


At the very least, you should be interested and enthusiastic about gaining new knowledge – not actively trying to avoid it.

Whether you learn “top-down” or “bottom-up”, you’re going to have to answer the hard questions sooner or later.

If you choose this path, then it’s your job to make the journey.

Don’t expect others to carry you.
