This article will answer the question, “I meet the prerequisites, but I can’t understand your course. Why?”

The short answer:

You THINK you meet the prerequisites, but in REALITY, you do not.

Questions you should ask yourself:

Do I have any incentive to lie to you about the prerequisites? Why would I want to have a bunch of frustrated students who rate the course poorly?

Does it make sense for the student to decide whether or not they meet the prerequisites, or the instructor?

If you think “student”: That’s like saying a student of a driving school can decide whether or not they are ready to drive, which is clearly nonsense. The driving instructor decides whether or not you are ready! Obviously, driving would be very unsafe if we just let everyone decide for themselves when they are allowed to drive…

The long answer:

The simplest way to discuss this is by way of example. Let’s take my Linear Regression course.

It depends on 3 undergraduate / college-level math topics:

Calculus

Matrix arithmetic (I hesitate to say “linear algebra” because it doesn’t depend on most of the college-level topics of linear algebra, just high school-level matrix manipulation)

Probability

The matrix and vector arithmetic part should have already been covered in your high school math courses, and you should have applied those concepts already in Calculus 3, which typically covers vector calculus and some differential equations.

Not everyone is an A+ student

Let’s get this out of the way. Not everyone is an A+ student.

If you got a B or a C (or worse), there were many things you didn’t understand.

You passed the course, but it still means you didn’t understand 20% or more of the material.

That’s fine if you just need the course to progress in your degree.

That’s NOT fine if you need to apply the concepts in later courses.

If you got a D or an F, forget it. It doesn’t count at all.

So when I say “calculus is a prerequisite”, I don’t mean “you took a calculus class”, I mean you actually understand everything you learned and you’re able to apply it now.

If you got a B/C/D/F/whatever, act like a mature, responsible adult and just be proactive. Learn the skills you need to learn to get where you want to go, and stop blaming others for your insufficiencies.

To be very clear: this is not me being a hardass or being elitist.

I’m telling you to build these skills BEFORE taking this course, because you need these skills DURING the course.

Isn’t this just common sense?

“I did calculus 20 years ago and forgot everything, does that count?”

It should go without saying: no, it does not count.

Why?

You need to apply these concepts.

If you don’t know the concepts, you can’t apply them.

Many of the “older” generation like the idea of credentials.

“If I just get a piece of paper saying I know this subject, I’m good!”

No, you’re not good.

When I say calculus is a prerequisite, I don’t mean you need to show me a certificate saying you’ve learned calculus. (C’mon guys).

You need the skills to back it up.

The reason should be obvious.

“You shouldn’t assume students know <X>”

This one always gets me.

I always laugh when I see this and think, “yup, another person who can’t wrap their head around the concept of following instructions”.

Guys, the instructions are there.

They’re called the prerequisites.

Udemy students are notorious for not following instructions, which is why I’ve listed them TWICE in the course description.

On top of that, I mention it AGAIN in the lecture “How to succeed in this course”, which usually follows the introduction and outline lecture.

I mention them AGAIN in the FAQ, in the lectures “Is this course for beginners or experts?” and “How to succeed in this course (long version)”.

Yes, I really mention it that many times, because some students are that resistant to following them.

Really, it’s just an excuse for me to say: “I mentioned the prerequisites five times, and you still didn’t follow them?” 😉

There’s a difference between:

Assuming you know something and not giving you any warning

Assuming you know something because I TOLD YOU to know it and REMINDED you several times

I hope that the difference is obvious.

Example 1: Matrix Calculus

Let’s give some examples of students who think they meet the prerequisites, but really don’t.

A good example is matrix calculus, used a few times in my Linear Regression course (in a very basic way).

It’s meant to be an introductory course in machine learning and paves the way for all the other more advanced courses I teach (25+ so far).

Facts:

Matrix calculus is not a prerequisite of this course

Matrix calculus doesn’t need to be a prerequisite to this course

There is no “class” on matrix calculus that you can take

Some beginner students see the Matrix Cookbook and freak out. Why?

It’s not because I’ve failed to teach them “matrix calculus”.

It’s because they are not good at regular calculus! (And to be clear – that’s your fault).

Remember, after you’ve passed your calculus 1/2/3 classes, it’s all about applying what you learned in later courses.

Matrix calculus is essentially partial differentiation applied to the matrix and vector arithmetic you already learned in high school.
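To make that concrete, here is a minimal sketch in plain Python. It takes one identity of the kind you would look up in the Matrix Cookbook, the gradient of f(w) = w^T A w being (A + A^T) w, and checks it against ordinary partial derivatives estimated one component at a time. The function names and the values of A and w are made up for illustration; nothing here comes from the course code.

```python
# Check a Matrix Cookbook-style identity using only ordinary partial derivatives:
#   f(w) = w^T A w   =>   df/dw = (A + A^T) w

def quad_form(A, w):
    """f(w) = sum over i, j of w[i] * A[i][j] * w[j]."""
    n = len(w)
    return sum(w[i] * A[i][j] * w[j] for i in range(n) for j in range(n))

def analytic_grad(A, w):
    """The matrix-calculus answer (A + A^T) w, written out with plain loops."""
    n = len(w)
    return [sum((A[k][j] + A[j][k]) * w[j] for j in range(n)) for k in range(n)]

def numeric_grad(f, w, eps=1e-6):
    """Central-difference estimate of each partial derivative df/dw[k]."""
    grad = []
    for k in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[k] += eps
        w_minus[k] -= eps
        grad.append((f(w_plus) - f(w_minus)) / (2 * eps))
    return grad

# Arbitrary example values (assumptions for illustration only)
A = [[2.0, 1.0], [0.0, 3.0]]
w = [1.0, -2.0]

g_analytic = analytic_grad(A, w)
g_numeric = numeric_grad(lambda v: quad_form(A, v), w)
print(g_analytic)
print(g_numeric)
```

The two printed vectors should agree up to floating-point error, which is the whole point: the "matrix" rule is just the partial derivatives you already know, collected into a vector.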

Some beginner students have asked me to show them the relevant rule in the matrix cookbook, or assume that there’s some “trick” to learning how to use this book.

No.

It’s a book. You’re supposed to have the skills to read a book.

Given a book, you should be able to look up relevant information.

These are basic life skills, man!

Example 2: Multivariate Normal Distribution and Probability

Probability is a tricky subject, because it’s taught at all levels (high school, college, and graduate school).

For machine learning, the most relevant level is college level. You can watch this video for more details: https://youtu.be/5Iq7tcrTnWA

It should be obvious in any case:

Obviously, I don’t mean high school probability, since machine learning is a 3rd-4th year subject

Obviously, I don’t mean graduate level probability, because most people don’t have graduate degrees (but again, graduate school comes after 4th year)

If your level of probability is: p(heads) = # heads / (# heads + # tails), that’s just not enough.

Seeing a multivariate normal PDF shouldn’t scare you.

If it does, then the correct course of action is NOT: “Hey, you haven’t explained this! BAD TEACHER! He needs to bring the course DOWN to MY level!”

The correct course of action IS: “Hmm, I wonder why I haven’t seen this before? Let me look it up and double check whether or not I meet the prerequisites. I’ll research this by myself so I can catch up to my fellow students”.

It’s your responsibility to look up the correct level of probability of this course (after I’ve made it clear), and it’s your responsibility to catch up on topics you don’t know.

It’s not my responsibility to teach you everything from scratch (a whole new course for free, are you kidding?)

Basic topics you should know:

PDFs, PMFs, CDFs

Common distributions: Bernoulli, Binomial, Poisson, Exponential, Normal, Multivariate Normal

CLT

Conditional distributions, Bayes’ rule

Expected values and functions of random variables

You do not need exposure to statistics concepts like maximum likelihood estimation or MAP estimation.

You should be good enough with probability to learn MLE and MAP as you take this course.

If you cannot, that means your skills in probability are not sufficient and you do not meet the prerequisites.

I know the topics you listed, but I still can’t understand

Many people will see these “lists” (like: PDFs, PMFs, and CDFs) and say, “yes, I know these topics”.

WRONG.

Is that good enough?

NO.

People often confuse:

Being exposed to a concept

Being able to solve problems using the concept

When I say “know PDFs”, I don’t just mean “know what a PDF is”, I mean, actually be able to do useful computations involving PDFs.

In other words:

Reading a Wikipedia page or watching a YouTube / Khan Academy video is NOT enough.

You must be able to solve problems and do math.
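As one example of what "a useful computation involving PDFs" might look like, here is a rough sketch that evaluates a 2-D normal density by hand and checks that, with a diagonal covariance, it factorizes into a product of two univariate normal densities. The function names and the test point are assumptions chosen for illustration, and it only handles the 2-D case.

```python
import math

def bivariate_normal_pdf(x, mean, cov):
    """Density of a 2-D normal at point x, written out without any library.

    N(x | mu, Sigma) = exp(-0.5 * (x - mu)^T Sigma^{-1} (x - mu))
                       / (2 * pi * sqrt(det(Sigma)))
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of a 2x2 matrix: (1 / det) * [[d, -b], [-c, a]]
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def univariate_normal_pdf(x, mean, var):
    """Density of a 1-D normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

# With a diagonal covariance the two components are independent,
# so the joint density should equal the product of the marginals.
p_joint = bivariate_normal_pdf([1.0, -0.5], [0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]])
p_product = univariate_normal_pdf(1.0, 0.0, 1.0) * univariate_normal_pdf(-0.5, 0.0, 4.0)
print(p_joint, p_product)
```

If you can predict that these two numbers match before running the code, and explain why they would stop matching once the off-diagonal entries are nonzero, that is the kind of "knowing PDFs" I am talking about.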

Example 3: Relationship between Squared Error, MLE, Regularization, and MAP

In my Linear Regression course, we show the equivalence of squared error minimization and MLE, and the equivalence of regularized regression and MAP.

This only requires exponentiating the negative of the loss function to recognize that its “form” or “shape” is proportional to the likelihood (or the posterior in the MAP case).
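For reference, the argument can be sketched in a few lines. The symbols are my own choice for this sketch, and it assumes the usual setup: Gaussian noise with variance \sigma^2 on the targets, and (for the MAP case) a zero-mean Gaussian prior on the weights with variance \sigma^2 / \lambda.

```latex
% Assumptions: y_i = w^\top x_i + \epsilon_i with \epsilon_i \sim N(0, \sigma^2),
% and prior w \sim N(0, (\sigma^2 / \lambda) I). J(w) is the squared error loss.
\begin{align*}
J(w) &= \sum_{i=1}^{N} (y_i - w^\top x_i)^2 \\
p(y \mid X, w) &= \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{(y_i - w^\top x_i)^2}{2\sigma^2} \right)
  \;\propto\; \exp\!\left( -\frac{J(w)}{2\sigma^2} \right) \\
\arg\max_w \, p(y \mid X, w) &= \arg\min_w \, J(w) \\[6pt]
p(w) &\propto \exp\!\left( -\frac{\lambda}{2\sigma^2} \lVert w \rVert^2 \right) \\
p(w \mid X, y) &\propto p(y \mid X, w)\, p(w)
  \;\propto\; \exp\!\left( -\frac{J(w) + \lambda \lVert w \rVert^2}{2\sigma^2} \right) \\
\arg\max_w \, p(w \mid X, y) &= \arg\min_w \left[ J(w) + \lambda \lVert w \rVert^2 \right]
\end{align*}
```

Because the exponential is monotonic and the sign in the exponent is negative, maximizing the likelihood (or posterior) is the same as minimizing the loss (or the regularized loss).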

Some beginner students have trouble “seeing” the equivalence, even when the equations are presented in front of their very eyes.

If you can’t see this equivalence, again, it’s because your skills in probability are not sufficient.

It means you need to IMPROVE your math knowledge.

The big mistake students make is thinking that they are perfect – they don’t need to improve.

They believe others should be bending over backwards to make things easier for them.

This is a stupid, self-centered approach which only hurts them in the end.

By not recognizing that THEY lack skills (in many cases, insisting that it cannot be true!), they will never gain new skills.

So, I suppose they reap what they sow in the end.

What is the solution?

The solution is obvious, and I’ve alluded to it earlier in this article.

No, it’s not just “meet the prerequisites”.

See, you really have 2 options:

Blame OTHER PEOPLE for your insufficient prerequisite knowledge.

IMPROVE your knowledge to meet the prerequisites sufficiently.

Obviously, only one of these will actually make you a better, more knowledgeable person.

The type 1 people often say things like: “You should have explained X, Y, Z, etc.” presuming that their knowledge of the prerequisites is perfect.

They presume that it must be MY prerequisites that are wrong, because I didn’t know what they knew.

No, the course will not be customized exactly for your background.

I don’t know you. I can’t know what you know or don’t know.

What happens when you are in the REAL world?

Do you go to your college professor and say, “Hey man! You made this course too hard! I DEMAND that you help me review the prerequisites!”?

No, no you don’t do that.

It’s called “catching up”.

You work hard and “catch up” to your peers, so that you can pass the class.

You don’t ask for the rest of the class to wait for you.

In this article, we’re going to discuss what the word “appendix” means and why you see this section in most of my courses.

Firstly, I was surprised that so many students didn’t understand what an “appendix” is, because the concept already appears in many books. I guess some of my students are not reading many books…

This is actually why I renamed this section to “Appendix / FAQ”, since I hoped that if students had never heard of the word “appendix”, at least they must know what “FAQ” means.

Side note: “FAQ” stands for “frequently asked questions”.

Appendix: supplementary material usually attached at the end of a piece of writing

“Ok sure, but do you have any examples?”

Here is an example from one of the most famous machine learning books of all time (Bishop’s Pattern Recognition and Machine Learning):

I hope the concept is obvious (please let me know if it’s not obvious for you).

The purpose of an appendix is to provide material which is related but not part of the main content.

The main content of PRML covers topics like Linear Regression, K-Nearest Neighbor, Gaussian Mixture Models, etc.

Obviously, “Data Sets” and “Properties of Matrices” are not machine learning algorithms. But that doesn’t mean they shouldn’t belong in a machine learning book!

What do I cover in my Appendices?

Now that you have one example of an Appendix, let’s look at another: the one existing in my courses. What is covered? Examples:

Supplementary technical content to help students with common problems

Installing Python and Python libraries / environment setup

Instructions for coding exercises

Tips for differentiating Python 2 vs. Python 3

Why many beginners have the wrong idea about Jupyter Notebooks

How to succeed in the course (obviously very important!)

Machine Learning and AI Prerequisite Roadmap / What order should I take your courses in?

Where to get free material, sign up for my newsletter, etc.

I hope that you have the common sense to understand why these would be included in the course.

Why are these topics included in the Appendix?

The important thing to remember is that you are not the only person taking the course. In most cases, there are thousands of students in each course.

I’ve been teaching these courses for 6+ years now.

Obviously, I’ve grown familiar with the most common student questions and concerns.

Therefore, it’s convenient for me to address these questions and concerns before they ever reach my inbox.

This saves the student time (because they don’t end up frustrated and then emailing me) and it saves me time (because I don’t have to copy and paste the same response again and again).

Why would you be opposed to that?

Are you against me helping other students?

Why can’t you include it in a separate course?

This is the dumbest idea I’ve ever heard.

For some reason, students get their feelings hurt by the appendix sometimes. Why? I have no idea. More on that later…

Would you suggest to Christopher Bishop that he should make a separate book just about “Data Sets” and “Properties of Matrices”?

Do you think Appendices should be removed from all books and sold as separate books?

If you can’t answer “yes” to this question, then you have just realized your inconsistent and poor logical thinking.

You are being selfish

The main reason in my opinion that students dislike the appendix (if they know what “appendix” even means to begin with) is that they are selfish.

They want the course to be only for them, not for anyone else.

They don’t care if these lectures answer the most common questions.

They don’t care if students would have more difficulty and more obstacles if these questions weren’t answered in an obvious and convenient place (e.g. without having to go to a different website, URL, an entirely different course, or having to email me and wait for an answer).

It doesn’t harm you in any way

Maybe students really believe they are not being selfish.

Maybe students really believe that the existence of the appendix really hurts them.

My question is: why / how?

How does answering the questions of other students hurt you?

The course promises to teach Topics A, B, C.

I presume you took the course to learn Topics A, B, C.

The course has materials on Topics A, B, C.

Therefore, you got what you should have expected from the course: Topics A, B, C.

The course has extra materials I’ve added to answer common student questions.

Adding extra materials doesn’t take away anything from you.

Does it?

As always, I have no idea why students can disagree with such obvious and logical common sense.

So, if you are one of those students, please just email me using the contact form above and explain your thinking. I am eager to listen!

“I thought the course was going to be 10 hours long but it only had 8 hours of main content!”

This is another one of the dumbest ideas I’ve heard.

Which is correct?

You took the course because it was 10 hours

You took the course because you want to learn Topics A, B, C.

The argument by some illogical students is that they thought the course was 10 hours, but 2 hours of that was “only” appendix material.

Aside from the selfishness (see above) and the fact that the student was too incompetent to even read the course curriculum before signing up, this is STILL a very dumb idea.

Why?

I LOVE learning things fast.

Time is precious. I don’t have any to waste.

If you tell me that I can learn something 20% faster than I previously assumed, I would be overjoyed!

That’s great!

It means I learned what I expected to learn, in less time.

Can someone explain to me why the above doesn’t make perfect sense?

You think you don’t need the appendix, but you do

The reality is, some students are simply entitled brats.

I know it’s not professional to say that, but when have I ever claimed to be professional?

I prefer to tell the truth, even if it hurts your feelings.

I’ve lost count of how many times students complained about the appendix, yet:

Failed to use the Q&A (the #1 rule of “how to succeed in this course”)

Failed to do the exercises as instructed

Failed to set up their environment correctly

Failed to understand the prerequisites

Failed to understand the purpose of the course

Failed to understand the differences between Python 2 and 3

Failed to understand why they don’t need Jupyter Notebook to run Python code

And the list goes on…

If these students had stopped complaining and followed the instructions right in front of their very eyes, they would have fixed their understanding.

Yet they prefer to just complain and remain incompetent. Why?

One of the most common complaints I hear from students is: Why do I have to learn all this math? Why isn’t there a library to do what I want?

Someone recently made this proclamation to me: “You should explain that your courses are for college students, not industry professionals”.

This made me laugh very hard.

In this article, I will refer to students who make such proclamations as “ML wannabes” for lack of a better term, because people who actually do ML generally know better than this.

Choosing Between Academic and Professional is a False Dichotomy

Yes, even Geoffrey Hinton, Yann LeCun, and Yoshua Bengio have to choose between “academia” and “industry”.

But are they choosing between using TensorFlow vs. exploring the fundamental ideas in machine learning (which necessarily involves lots of theoretical thinking and math)?

No.

I think it’s clear that Geoffrey Hinton isn’t sitting there and saying, “screw all this math, let me just plug my data into Keras”.

Ok, but you and I are not Geoffrey Hinton. So what about us?

When ML wannabes say that my courses are for college students and not professionals, my immediate thought is: What kind of so-called “professionals” do you work with?

Are they fake professionals?

What do you think college students do after they graduate college?

I hope these questions aren’t too philosophical… it’s a pretty standard path: college students graduate college, then work as a professional.

Ergo, professionals are former college students. They have all the knowledge of a college student, and then some.

So isn’t it the case then that being a professional means that they are now experts at all this “math stuff”?

By that logic, shouldn’t it be the case that professionals are the best-equipped to learn all this “math stuff”?

AND vs OR

You are not choosing between having an understanding of the math behind machine learning and its practical application.

Being effective at applying machine learning practically involves having a base level of theoretical understanding.

Conversely, having a good understanding of machine learning in theory involves a base level of understanding of how it will be applied in the real world.

i.e. It’s “AND”, not “OR”. You don’t get to choose between these. If you miss one, you’ll be bad at the other. You need both.

Notice that I said a base level of understanding: you don’t have to do a PhD in statistical learning theory. In fact, I hate statistical learning theory.

Nothing I teach involves PhD-level math, so if you think that’s what it is, then you are overestimating everything, including your own skill. I always find it funny when students say “you need your PhD to do this math”. Actually, saying stuff like that just makes YOU look silly because you think you’re closer to it than you really are. In fact, it’s not “PhD math” you’re having trouble with, it’s just undergraduate math…

Even worse: the people who tell me they have a PhD and that’s why they know what they’re talking about. This is the funniest. You have a PhD and you admit that you still have trouble with undergraduate math? Isn’t that just making you look bad? Very silly indeed.

Professionals forgot what they learned in college

You may be excused if it’s been 20 years since you graduated college and you’re not doing math or algorithms every day.

Ok, but then what are you doing? Why did they hire you in the first place?

Again, saying all this stuff just makes YOU look bad.

All that stuff you learned in college went out the window?

You paid thousands of dollars and learned nothing useful whatsoever?

Well maybe you got a job that only requires doing database queries and building report dashboards all day.

You got comfortable.

No problem with that. It pays well. It’s steady. You don’t have to bang your head against the wall all the time.

But now what? What are you going to do about it?

Are you going to blame other people for your situation?

If I want to go back to being a professional tennis player after a few years off the courts due to injury, is it my job to train myself back to a world-class level, or is it my opponents’ job to go easy on me?

Coding Interviews

Coding interviews at Google, Facebook, Amazon, etc. are great examples of why, as a professional, you can’t simply forget everything you learned in college.

Not knowing “college material” doesn’t make you “not a college student”, it makes you a bad professional.

i.e. You’re supposed to know this stuff, and yet you don’t.

You can’t say “I’m not a college student”, because that doesn’t excuse you from not knowing this stuff anyway.

What about people who didn’t go to college? There are tons of stories out there about people who have worked their way up from scratch on their own. They learned what they had to in order to pass the coding interview.

They did not say, “since I am not a college student, the company must change their standards when they hire me”.

They did all the same work as a college student, and in fact, it is even more admirable that they taught themselves!

So as an ML Wannabe, realize that saying “I’m not a college student” is not an excuse, it’s merely equivalent to saying, “I’m not a good professional, I’m a professional that lacks knowledge”.

There is a funny contradiction when it comes to coding interviews:

The wannabes always say, “industry doesn’t care about academics”.

Then when it comes to these coding interviews that they have trouble with, they complain: “industry cares too much about academics!”

So which is it?

That is why professionals at these tech companies do such great work: they are not just professionals. They are professionals who apply what they learned in college on a daily basis.

More on that next.

How Google Came To Be

With hindsight, we can look with great awe at how Google became the giant it is today.

It all started with a simple Markov model (the kind of math ML Wannabes try to avoid).

This Markov model was a model of links to webpages on the Internet, and this allowed the founders of Google to create the most powerful search engine the world had ever seen.
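To give a feel for what a Markov model of links can look like, here is a toy sketch of power iteration on a four-page "web". It is not a claim about how Google actually implemented anything: the function name, the damping factor of 0.85, the number of iterations, and the tiny link graph are all assumptions for illustration. The resulting ranks approximate the stationary distribution of the Markov chain, i.e. the eigenvector of its transition matrix with eigenvalue 1.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a small link graph.

    links maps each page to the list of pages it links to.
    Every linked-to page must also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(iterations):
        # Random-jump term: with probability (1 - damping), go to any page
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Follow one of this page's links, chosen uniformly
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Page with no links: spread its rank evenly over all pages
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: "c" is linked to by everyone else
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

On four pages this is a few dictionary updates. The same loop is a repeated matrix-vector multiplication in disguise, which is where the trouble starts once the matrix no longer fits on one machine.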

Of course there were engineering challenges as well. How do you find the eigenvalues of a matrix with over a million rows and columns?

“Aha!” You say. The Lazy Programmer is wrong. Clearly this is a practical engineering problem and not an academic one!

Sorry to say, but you are still wrong.

What library do you use to factor a million x million matrix? Oh right, one doesn’t exist.

Google didn’t just pioneer the mathematics of search; they also pioneered the field of big data, which is a subset of distributed computing.

Why didn’t they just use the MapReduce library? Oh right, because they had to invent it first!

If you don’t believe that this is an “academic” subject, you can read the many papers Google has put out on its file system, global databases, etc.

I talk a lot about math in this article but another thing ML Wannabes really hate is programming things on their own. (They prefer to use libraries that involve just a few lines of code to get the job done quickly.)

So what happens when the library you want doesn’t exist?

Do you say screw it and move on to something else?

Well that’s what differentiates the leaders and the followers.

And surely one must ask themselves: is creating a billion dollar company practical?

Whether you agree with these hiring practices or not, one can’t dispute results.

You Live in a Bubble

You might say: “Everyone around me is a professional, and they would all disagree with YOU, Lazy Programmer!”

You live in a bubble. Everyone does.

It makes sense if you think about it.

Your company hired people of similar aptitude to be on your team to get a particular job done. You are surrounded by like-minded people.

Of course you are.

And if anyone disagreed with you, you probably wouldn’t be friends with them anyway.

The likelihood of you being surrounded by opposing viewpoints is small.

How would your team get anything done if you could never agree?

But you can’t make the assumption that whatever is in your immediate radius applies uniformly throughout the rest of the world.

There are tens of thousands of STEM undergraduates going into the workforce each year.

Do you think they just automatically forget their undergraduate training?

Are the past 4 years simply erased from their minds?

No – instead, they become these coveted professional-college student hybrids you so fear.

Choose Your Own Life; Choose Your Own Career

I get it.

At some point, you want to stop thinking so hard.

You want to have a family.

You want to start taking up other hobbies that do not involve being a geek.

I can’t say that won’t be me someday.

In that case, go for that cushy job where it’s a little easier and you get to use all the libraries you want and never have to think about calculus and graph algorithms.

There’s nothing wrong with that.

A comfortable life, a comfortable software developer salary…

This is an excellent goal to have in life.

But you can’t have your cake and eat it too.

If you want to be a real professional (and not a wannabe) then you have to put in the work.

I’m just a vessel for information. I take machine learning and bring it to you.

I did not invent machine learning. So if there’s math, there’s math because the guy who invented it used that math.

Don’t blame me.

If you want to do machine learning, then accept what machine learning is.

You can’t choose to do machine learning, and then refuse to do all the work that everyone else did.

What makes you feel so self-entitled that you think everyone else has to do it except you?

At the very least, you should be interested and enthusiastic about gaining new knowledge – not actively trying to avoid it.

Whether you learn “top-down” or “bottom-up”, you’re going to have to answer the hard questions sooner or later.

If you choose this path, then it’s your job to make the journey.

As we all know, the near future is somewhat uncertain. With an invisible virus spreading around the world at an alarming rate, some experts have suggested that it may reach a significant portion of the population.

Schools may close, you may be ordered to work from home, or you may want to avoid going outside altogether. This is not fiction – it’s already happening.

There will be little warning, and as students of science and technology, we should know how rapidly things can change when we have exponential growth (just look at AI itself).

Have you decided how you will spend your time?

I find moments of quiet self-isolation to be excellent for learning advanced or difficult concepts – particularly those in machine learning and artificial intelligence.

To that end, I’ll be releasing several coupons today – hopefully that helps you out and you’re able to study along with me.

Despite the fact that I just released a huge course on TensorFlow 2, this course is more relevant than ever. You might take a course that uses batch norm, Adam optimization, dropout, batch gradient descent, etc. without any clue how they work. Perhaps, like me, you find doing “batch norm in 1 line of code” to be unsatisfactory. What’s really going on?

And yes, although it was originally designed for TensorFlow 1 and Theano, everything has been done in TensorFlow 2 as well (you’ll see what I mean).

Cutting-Edge AI: Deep Reinforcement Learning in Python

A lot of people think SVMs are obsolete. Wrong! A lot of you students want a nice “plug-and-play” model that works well out of the box. Guess what one of the best models is for that? SVM!

Many of the concepts from SVMs are extremely useful today – like quadratic programming (used for portfolio optimization) and constrained optimization.

Constrained optimization appears in modern Reinforcement Learning, for you non-believers (see: TRPO, PPO).

Well, I don’t need to tell you how popular GANs are. They sparked a mini-revolution in deep learning with the ability to generate photo-realistic images, create music, and enhance low-resolution photos.

Variational autoencoders are a great (but often forgotten by those beginner courses) tool for understanding and generating data (much like GANs) from a principled, probabilistic viewpoint.

Ever seen those cool illustrations where they can change a picture of a person from smiling to frowning on a continuum? That’s VAEs in action!

This is one of my favorite courses. Every beginner ML course these days teaches you how to plug into scikit-learn.

This is trivial. Everyone can do this. Nobody will give you a job just because you can write 3 lines of code when there are 1000s of others lining up beside you who know just as much.

It’s so trivial I teach it for FREE.

That’s why, in this course (a real ML course), I teach you how to not just use, but implement each of the algorithms (the fundamental supervised models).

At the same time, I haven’t forgotten about the “practical” aspect of ML, so I also teach you how to build a web API to serve your trained model.

This is the eventual place where many of your machine learning models will end up. What? Did you think you would just write a script that prints your accuracy and then call it a day? Who’s going to use your model?

The answer is, you’re probably going to serve it (over a server, duh) using a web server framework, such as Django, Flask, Tornado, etc.
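To give a rough idea of the shape of such a service, here is a minimal sketch. To keep it self-contained it uses Python's built-in http.server module rather than Django, Flask, or Tornado, so treat it as an illustration and not as the code from the course. The "model" is a hypothetical linear model with made-up weights, and the names predict, PredictHandler, and serve are my own.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained" linear model: these numbers are made up.
WEIGHTS = [0.5, -1.0]
BIAS = 2.0

def predict(features):
    """Return the linear model's prediction for one feature vector."""
    if len(features) != len(WEIGHTS):
        raise ValueError("expected %d features" % len(WEIGHTS))
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect a JSON body like {"features": [1.0, 2.0]}
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            body = {"prediction": predict(payload["features"])}
            status = 200
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON, missing key, or wrong input shape/type
            body = {"error": str(exc)}
            status = 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port=8000):
    """Start the server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling serve() starts the server; a client can then POST a JSON body such as {"features": [1.0, 2.0]} to http://127.0.0.1:8000/ and get a JSON prediction back. A real deployment would load a saved model instead of hard-coded weights and would sit behind a proper web framework.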

Never written your own backend web server application before? I’ll show you how.
Alright, that’s all from me. Stay safe out there folks!

Note: these coupons will last 31 days – don’t wait!