Common-Sense Tips for Beginners

On this page, I intend to share some common-sense tips that sometimes elude many beginners. I have learned over the years that many beginners tend to face the same issues, and thus, these tips are intended to help those who don’t yet have a good grasp of how to “learn” from a course.

If you think any of the facts I present below are incorrect, please let me know at my Suggestion Box.


Table of Contents:

  1. Why don’t you update your courses to use the latest deep learning libraries (TensorFlow 2, etc.)?
  2. Why don’t you use Jupyter Notebook?
  3. The appendix is useless and it deceived me into thinking the course was longer
  4. Why isn’t the appendix at the front of the course instead?
  5. Should you code along?
  6. But I want to code along!
  7. What do you mean by “all data is the same”?
  8. I need to learn this FAST, why are you slowing me down by making me do all this math?
  9. Stop telling me to meet the prerequisites!
  10. Please just explain all the steps clearly, you are missing so many steps!
  11. I’m your customer, therefore you must do as I demand
  12. Why do we keep reinventing the wheel? Professionals use APIs!
  13. I came to your course after reading 10 books about X topic, I wanted to learn something new, but it offered nothing
  14. Why don’t you implement algorithms in stages, like a website or app?


Why don’t you update your courses to use the latest deep learning libraries (TensorFlow 2, etc.)?

There are many good reasons.

1) It’s simply not the standard. For example, do a search for iOS courses. You will see one course for iOS 9, one course for iOS 10, etc. from the same author. The author doesn’t simply keep updating the same old course.

Why hold only me to a unique standard?

2) Often, it doesn’t make sense to write updated code for very old algorithms (example: Restricted Boltzmann Machines). What’s the point of re-writing code for an algorithm invented 12 years ago when nobody is using it anymore?

3) You aren’t following my instructions to code by yourself. Please watch “Where to get the code”, “How to code by yourself”, and “Should you code along?” for more info.

I’ve had students who implemented the exercises in Java. You simply have no excuse for not coding by yourself and sitting there like a couch potato.

If you are competent in whatever library you use (whether that’s PyTorch, MXNet, or TensorFlow 2), then you should be able to take an algorithm and implement it from scratch.

Copying from others is the lowest form of learning.

I translate code from different libraries all the time. It’s a necessary skill in machine learning. If I’m looking for an implementation of an algorithm and it’s implemented in a different library, I consider that a gift.

It makes implementing things faster.

That means I don’t need to debug any assumptions I made about the algorithm as it was described in words, I can simply look at the reference implementation.

One great example of that (from my own personal experience) is GloVe (a text embedding algorithm).

I was able to use the original GloVe code (written in C), to help me with my Python implementation.

Why didn’t I just implement it directly from the paper, you ask? Because papers are notorious for missing critical details and being ambiguous. The C code helped me clarify those ambiguities and missing details (otherwise, I would have had to guess, which would have led to suboptimal results).

This is a great example of how code written using an “ancient” language was a gift.

If you can’t take a reference implementation and translate it into your own language of choice, then you simply are not competent in that language. It’s your responsibility to fix that.



Why don’t you use Jupyter Notebook / Spyder / PyCharm / etc.?

It doesn’t matter what I use – Python code is the same no matter what the environment.

Remember, you are watching a video.

Therefore, whatever environment I use is irrelevant. You are just looking at the code.

The code will be the same, no matter what.

If you want to use Jupyter Notebook, then use it.

Nothing is preventing you from copying & pasting my code into Jupyter Notebook.

If you are confused about this, you are probably too inexperienced with Python for my courses anyway – so my suggestion is to first learn Python and improve your Python skills.

Furthermore, there are many reasons not to use Jupyter Notebook.

E.g. Tracking changes in Git is horrendous.

E.g. Running an entire HTTP server on your computer just to run some Python code is slow.

E.g. Often running code in Jupyter notebook is slower.

E.g. Constant kernel crashes (need I say more?)
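
The Git point can be sketched with nothing but the standard library. The snippet below builds a hand-written, heavily simplified notebook-like JSON document (real .ipynb files carry more fields), changes one character of code, "re-runs" the cell, and diffs the two files. The helper name notebook_json and the cell contents are made up for illustration.

```python
import difflib
import json


def notebook_json(source_line, execution_count, output_text):
    """Return the lines of a minimal, simplified notebook-like JSON file."""
    notebook = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [output_text]}
                ],
                "source": [source_line],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    return json.dumps(notebook, indent=1, sort_keys=True).splitlines()


# One character of code changes, and the cell is run again.
before = notebook_json("print(1 + 1)", 1, "2\n")
after = notebook_json("print(1 + 2)", 2, "3\n")

# Keep only the added/removed lines, skipping the "---" / "+++" file headers.
changed = [
    line
    for line in difflib.unified_diff(before, after, lineterm="")
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

In a plain .py file the same edit would touch a single line. Here the execution counter and the saved output change as well, and that noise is what makes notebook diffs hard to review.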


The appendix is useless and it deceived me into thinking the course was longer

Firstly: if you are not well-read, perhaps you didn’t know that the “FAQ” part of the section titled “Appendix / FAQ” is an acronym which stands for “Frequently Asked Questions”.

That means, these lectures answer questions students ask me on a daily basis (i.e. frequently), and hence, they are the opposite of useless.

Many students have thanked and congratulated me for this content in the past.

If you don’t like it: who cares?

The course wasn’t designed for you. It was designed for everybody.

Surely, you would not suggest that I remove content that others have specifically asked me for?

That sounds pretty selfish…

It deceived you into thinking the course was longer?

Let me be clear: this is totally your fault.

One should not blame their lack of due diligence on others. Take responsibility for your own mistakes, as a mature adult should.

Clearly, each section and each lecture is listed in the course description, free to view before signing up.

The duration (length) of each section / lecture is also free to view.

If one is dissatisfied with the curriculum at that time, one should not register.

However, many students make the mistake of thinking the Appendix is not for them.

Often, many students who have complained to me about the Appendix have also complained about things which were already answered in the Appendix!

So perhaps, instead of dismissing it and assuming the content isn’t useful, perhaps you should start actually paying attention.


Why isn’t the appendix at the front of the course instead?

The people who ask this question are the complete opposite of those who ask the previous one, which further proves my point there. =)

If you are not well-read, perhaps you are unaware that “appendices” are, as a standard, placed at the end of a book.

They contain content which is not necessary to read the book from front-to-back, but rather, serve as a reference one can go to while they are reading the book, in case they are missing background knowledge.

For example, in a book about Machine Learning, the appendix may cover some elementary theorems from Linear Algebra and Probability. So if one doesn’t remember something about Linear Algebra, they may flip to the back of the book to review, and then go back to what they were reading.

Similarly, my courses contain background knowledge in the appendix.

It would obviously be very silly to spend 2 hours reminding students to code by themselves, the difference between “academic” and “professional”, etc. before even getting to the content of the course.

The students who asked the previous question would be very displeased!

But let me be clear: the Appendix contains material that you should already know.

I shouldn’t have to explain how to install Python and Python libraries at this stage of the game.

It’s kind of like teaching students in an English literature class how to spell.

If you can’t spell, you probably aren’t ready for English literature.

Yet, I include these lectures out of the hope that you will use them to catch up in your own time (hence, background knowledge).


Should you code along?

Although when I first created my courses many years ago I started with this approach (it seemed to be common practice), I’ve come to realize how bad it is.

When you are implementing a complex algorithm, it does not take you 15-20 minutes as a “coding lecture” might suggest. This gives you the completely wrong idea of how coding is done.

I’d recommend watching my lecture “Should you code along?” for a more in-depth discussion.


But I want to code along!

Ok, if you decide to go against my advice and code along, what’s stopping you?

Why do I need to type in order for you to type?

This is very silly. You are watching a video.

There’s no need for me to type.

You can type if you like.

I have nothing to do with this.

Use the “pause”, “play”, and “speed” buttons of the video player as necessary. You shouldn’t need me to tell you this.

This is so basic, I am surprised I even need to explain this.

You are (hopefully) an autonomous, healthy, functioning adult. You don’t need my permission to type code…


What do you mean by “all data is the same”?

Many people have the wrong idea about what it means to “learn” machine learning.

If you want to “learn” Logistic Regression, you do not need to plug 10 different datasets into Scikit-Learn with the same 3 lines of code.

That has nothing to do with Logistic Regression.

What we mean by “all data is the same” is that the same “Logistic Regression algorithm” is applied in all of those 10 different examples.

The algorithm doesn’t change. Only the data changes.

That’s why a library like Scikit-Learn can exist in the first place. The same algorithm applies, no matter the dataset!

To learn Logistic Regression means learning about the algorithm.

Not learning about the data.

If you plugged 10 different datasets into Scikit-Learn, you didn’t learn 10 things.

You just repeated one thing 10 times…

Really learning an algorithm means understanding how it works mathematically and implementing it in code (note: using Scikit-Learn is not “implementing”).
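
To make “the algorithm doesn’t change, only the data changes” concrete, here is a minimal sketch of Logistic Regression written from scratch. It assumes plain Python lists and batch gradient descent, which is one reasonable training choice among several. The learning rate, the epoch count, and both toy datasets are made up for illustration.

```python
import math


class LogisticRegression:
    """Binary logistic regression trained with batch gradient descent."""

    def __init__(self, learning_rate=0.1, epochs=500):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.w = []
        self.b = 0.0

    def _prob(self, x):
        # Sigmoid of the linear score w.x + b
        z = self.b + sum(wj * xj for wj, xj in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, Y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            # Gradient of the mean cross-entropy loss uses (prediction - target)
            errors = [self._prob(x) - y for x, y in zip(X, Y)]
            for j in range(d):
                grad_j = sum(e * x[j] for e, x in zip(errors, X)) / n
                self.w[j] -= self.learning_rate * grad_j
            self.b -= self.learning_rate * sum(errors) / n
        return self

    def predict(self, X):
        return [1 if self._prob(x) >= 0.5 else 0 for x in X]


# Toy dataset 1 (made up): exam score minus class average -> passed (1) or not (0)
exam_X = [[-2.0], [-1.0], [1.0], [2.0]]
exam_Y = [0, 0, 1, 1]

# Toy dataset 2 (made up): two centered sensor readings -> faulty (1) or not (0)
sensor_X = [[-1.0, -1.0], [-1.0, -2.0], [-2.0, -1.0],
            [1.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
sensor_Y = [0, 0, 0, 1, 1, 1]

# The exact same class handles both; nothing about the algorithm changes.
for name, X, Y in [("exam", exam_X, exam_Y), ("sensor", sensor_X, sensor_Y)]:
    model = LogisticRegression().fit(X, Y)
    print(name, model.predict(X) == Y)
```

Swapping in a third dataset would teach nothing new about Logistic Regression. Everything there is to learn lives in fit and predict.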


I need to learn this FAST, why are you slowing me down by making me do all this math?

Let me be blunt: I don’t care about you.

How can I, if I don’t even know you?

Isn’t it quite presumptuous to assume I had you in mind, when I created this course?

I teach the subject at hand.

Whether you can acquire that knowledge fast or slow is all up to your own thinking abilities.

Your learning potential has nothing to do with me.

If you’re talented, maybe you can learn faster. If you lack talent, maybe you learn more slowly.

One cannot simply choose to learn something in a specified timeframe.

For example, could I say, “I want to learn quantum mechanics at a PhD level by next month”?

Any quantum physicist would declare me insane!

Furthermore, trying to “learn” things as “fast as possible” is cheap. If you want a summary, just go to Wikipedia.

If you want to be a cheap clone of machine learning engineers who actually know machine learning, believe me that your employer and coworkers will figure it out very soon.


Stop telling me to meet the prerequisites!

If I tell you that you need to work on your prerequisites, it is very very likely that you do.

Many people are offended by this; I don’t understand why.

Your feelings are irrelevant.

If you don’t understand a concept from calculus, probability, programming, whatever – you must handle it.

It’s completely fact-driven.

If you’re bad at calculus, improve your calculus.

There are no feelings involved, no need for you to get angry.

Many people don’t realize (or forget) that machine learning is taught in 3rd or 4th year and beyond – after the student has passed their calculus, linear algebra, probability, and basic programming courses.

If you want to try to learn ML without this background knowledge, basically you’re asking to learn “fake” ML (many marketers out there teach that, go to them).

Many people try to blame me for it or get angry with me as if it were my fault.

I didn’t invent ML.

Whether it depends on those topics or not is not my choice.

Many beginners make the mistake of trying to invent their own prerequisites. They say “I have a PhD in economics”. Sure, but if you didn’t remember what you learned in calculus, then it wasn’t very useful, was it?

If I give you an instruction, it’s not up to your interpretation.

If I tell you to go left and you decide going right is a better option, then the consequences are your own responsibility.

Furthermore, many people assume that just because they passed a course, it’s enough.

If you got a 60%, then there’s still 40% you are missing (roughly speaking).

It’s still your job to improve yourself, not my job to remind you.

If you took calculus, but you still don’t understand or can’t do basic homework problems – it’s your job to improve your ability and catch up.


Please just explain all the steps clearly, you are missing so many steps!

I find this one funny. Whenever I ask, “Which steps are missing?” there is never any answer.

Students who make this claim sound like they should be able to provide plentiful examples, yet provide none.

This is because I always show every single step.

Sometimes, people presume that I can predict their shortcomings / ability to understand a step.

Think more logically: how can I predict what you know or don’t know?

Unless many other students have asked me the same question, there is no way for me to know what gaps you have in your calculus, probability, etc. that prevent you from understanding.

That’s why it’s your job to point out what you think is missing, so I can tell you how to fill the gap.

If you think a step is missing, I hereby challenge you to use the Q&A and tell me what it is.


I’m your customer, therefore you must do as I demand


I always find this one hilarious.

This reminds me of the kinds of people who go to a restaurant and ask to sample all the food (yikes).

“I am the customer, I am right! If you want me to be your customer you will do as I say!” (hint: those restaurants don’t want you as a customer)

The elements of the transaction should be obvious.

You are buying a course. You’re not buying a tutoring service. You’re not buying a consulting service. You’re not buying an on-demand Q&A.

My 1-on-1 rate is $250/h, and those people must schedule ahead of time (a.k.a. not on-demand). To think you are entitled to on-demand services beyond the course materials for a fraction of the cost is beyond belief.

I recommend checking out the article “Beginner’s Guide to the Q&A” for more info.


Why do we keep reinventing the wheel? Professionals use APIs!

Mostly students who are inexperienced with computer science and coding say things like this.

What do you learn in a CS101 class? Typically, you’ll do exercises like implementing strings and linked lists.

Does this mean you will implement strings and linked lists in the “professional world”?

Of course not. We have libraries for that.

Unless you are in the 99.999th percentile, you won’t be implementing strings or linked lists.

Then why do we do it?

To make you suffer by constantly reinventing the wheel?

Clearly, universities around the world are not pumping out world-class software engineers by making them perform “useless” exercises.

If you don’t understand why such exercises are good and useful, it’s your understanding that’s the problem.
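
For concreteness, a CS101-style linked list exercise might look like the sketch below. The class and method names are illustrative, not taken from any particular curriculum.

```python
class Node:
    """One element of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    """Singly linked list with front insertion and in-place reversal."""

    def __init__(self):
        self.head = None
        self.size = 0

    def push_front(self, value):
        # The new node points at the old head and becomes the new head.
        self.head = Node(value, self.head)
        self.size += 1

    def reverse(self):
        # Walk the list once, flipping each node's pointer to its predecessor.
        prev, node = None, self.head
        while node is not None:
            following = node.next
            node.next = prev
            prev = node
            node = following
        self.head = prev

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

    def __len__(self):
        return self.size


numbers = LinkedList()
for value in (3, 2, 1):
    numbers.push_front(value)
print(list(numbers))
numbers.reverse()
print(list(numbers))
```

Nobody ships this in production, yet getting the pointer updates in reverse right is exactly the kind of understanding that calling a library never forces on you.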

Similarly, in machine learning, we “reinvent” (a better word is “derive”) algorithms and code because that’s what leads to the best understanding.

Many beginners try to opt for “fact memorization”. Tell me when to use the algorithm, tell me the pros and cons, tell me the best practices — no, no, and no!

Someone with zero understanding can memorize facts. If that is you, you’re totally missing the point.

Everyone loves Richard Feynman, right? Well, let’s see what he has to say about this:


I came to your course after reading 10 books about X topic, I wanted to learn something new, but it offered nothing

This is very illogical.

Let’s pick an arbitrary subject: calculus.

You’re saying you read 10 textbooks on calculus, and by taking an online calculus course, you expected to learn something new or different?

In what way does this make sense?

Calculus has already been invented.

Obviously, you shouldn’t expect anyone to have invented their own version of calculus to teach you online. That’s very silly!

Furthermore, if you’ve already read 10 textbooks and you still don’t get it – there’s probably something wrong with you…

Whether you take 10 online courses about calculus, 10 college courses about calculus, or read 10 books about calculus, obviously they are all going to be very similar! Calculus is calculus. It’s already been invented.


Why don’t you implement algorithms in stages, like a website or app?

Excellent question!

I think building things in stages is a fantastic approach. I am very enthusiastic about this way of learning.

It’s how I learned Ruby on Rails, ReactNative, and a few other app technologies.

The thing is: this does not apply to machine learning.

Let’s consider a basic web application for example. This very naturally comes in stages.

Maybe the first step is to simply generate the project and have a simple static homepage. Awesome!

Maybe the second step is to create the user model (which includes the corresponding database table) with a name and email. Awesome!

Maybe the third step is to create user authentication functionality. This might require a database migration, new functions in the user controller, and so forth. Awesome!

Maybe the fourth step is creating forms so that the user can actually sign up, log in, and log out. Awesome!

Maybe the fifth step is creating posts that the user can create on your website. So you have to create a new Post model, relevant forms for creating and editing posts, etc.

Later on, there will be steps for creating the user interface, sharing, adding friends, blocking other users, etc etc etc.

Building web apps is very naturally a “staged” process.

On the other hand, machine learning does not work that way.

Generally speaking, implementing an algorithm works like this. You start with a shell, such as:

class Model:
  def fit(self, X, Y):
    # your training algorithm
    pass
  def predict(self, X):
    # your inference algorithm
    pass

This will be the case pretty much no matter what algorithm you are learning about. There are no “stages” to this. You learn the algorithm, and then you put it into code.

In some cases (like Linear Regression) both learning and inference are just one line of code. In other cases (like Hidden Markov Models), both learning and inference are many lines of code.
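
As a rough sketch of the Linear Regression case, here is the same fit/predict shell filled in for a single input feature, using only plain Python. With a linear algebra library the fit step collapses to a single expression. This dependency-free version spells out the closed-form least-squares solution instead. The attribute names slope and intercept and the sample numbers are made up for illustration.

```python
class LinearRegression:
    """Least-squares linear regression with one input feature."""

    def fit(self, X, Y):
        n = len(X)
        mean_x, mean_y = sum(X) / n, sum(Y) / n
        # Closed-form solution: slope = covariance(x, y) / variance(x)
        covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
        variance = sum((x - mean_x) ** 2 for x in X)
        self.slope = covariance / variance
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, X):
        # Inference really is one line.
        return [self.slope * x + self.intercept for x in X]


# Made-up points that lie exactly on the line y = 2x + 1
model = LinearRegression().fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(model.slope, model.intercept, model.predict([4.0]))
```

Note that there is nothing useful to run until both the slope and the intercept have been computed. A half-finished fit gives you no model at all.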

But there are no “intermediate steps” in which the algorithm produces 10% useful stuff without the other 90%.

It just doesn’t work that way.

Either the whole thing works, or the whole thing doesn’t work. There are no “stages”.

If you are learning about an API rather than an algorithm, it may appear to be in stages. For example:

# load in the data
# build the model
# train the model
# evaluate the model

However, unlike the web app example, these are all strongly interconnected. They are also all focused solely on one single thing: the particular model you’re learning about.

“Loading in the data” is not a stage. That doesn’t do anything in and of itself, and by itself it tells you nothing about the model. Furthermore, if it’s anything like a typical API (Scikit-Learn, Keras), training and evaluation with one line of code each are hardly “stages”.

Creating a database table and corresponding models and controllers is a stage. Creating the header on your landing page is a stage. Calling “” is not a stage.