April 1, 2020
March 27, 2020
In this Data Science Interview Questions series, we’re going to answer the question:
Why do deep learning libraries have functions like “softmax_cross_entropy_with_logits_v2”?
Why can’t we just use the formulas we learned in class?
What do these functions do and how?
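To preview the answer: fused functions like this exist largely for numerical stability. Here is a minimal sketch in plain Python (my own illustration, not TensorFlow’s actual implementation) of the log-sum-exp trick they rely on:

```python
import math

def cross_entropy_with_logits(logits, label):
    # Naively computing -log(softmax(logits)[label]) overflows for large
    # logits (math.exp(1000) raises OverflowError). The fused version
    # uses the log-sum-exp trick: subtract the max before exponentiating.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

print(cross_entropy_with_logits([1000.0, 0.0], 0))  # finite, ~0.0
```

Feeding raw logits to a fused loss like this avoids the overflow/underflow you’d hit by computing softmax and log separately.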
Click to watch the video below:
March 10, 2020
One of the most common complaints I hear from students is: Why do I have to learn all this math? Why isn’t there a library to do what I want?
Someone recently made this proclamation to me: “You should explain that your courses are for college students, not industry professionals”.
This made me laugh very hard.
In this article, I will refer to students who make such proclamations as “ML wannabes” for lack of a better term, because people who actually do ML generally know better than this.
Yes, even Geoffrey Hinton, Yann LeCun, and Yoshua Bengio have to choose between “academia” and “industry”.
But are they choosing between using Tensorflow vs. exploring the fundamental ideas in machine learning (which necessarily involves lots of theoretical thinking and math)?
I think it’s clear that Geoffrey Hinton isn’t sitting there and saying, “screw all this math, let me just plug my data into Keras”.
Ok, but you and I are not Geoffrey Hinton. So what about us?
When ML wannabes say that my courses are for college students and not professionals, my immediate thought is: What kind of so-called “professionals” do you work with?
Are they fake professionals?
What do you think college students do after they graduate college?
I hope these questions aren’t too philosophical… it’s a pretty standard path: college students graduate college, then work as professionals.
Ergo, professionals are former college students. They have all the knowledge of a college student, and then some.
So isn’t it the case then that being a professional means that they are now experts at all this “math stuff”?
By that logic, shouldn’t it be the case that professionals are the best-equipped to learn all this “math stuff”?
You are not choosing between having an understanding of the math behind machine learning and its practical application.
Being effective at applying machine learning practically involves having a base level of theoretical understanding.
Conversely, having a good understanding of machine learning in theory involves a base level of understanding of how it will be applied in the real world.
i.e. It’s “AND”, not “OR”. You don’t get to choose between these. If you miss one, you’ll be bad at the other. You need both.
Notice that I said a base level of understanding: you don’t have to do a PhD in statistical learning theory. In fact, I hate statistical learning theory.
Nothing I teach involves PhD-level math, so if you think that’s what it is, then you are overestimating everything, including your own skill. I always find it funny when students say “you need a PhD to do this math”. Saying stuff like that just makes YOU look silly, because it shows you can’t tell how far you are from PhD-level math. In fact, it’s not “PhD math” you’re having trouble with – it’s undergraduate math…
Even worse: the people who tell me they have a PhD and that’s why they know what they’re talking about. This is the funniest. You have a PhD and you admit that you still have trouble with undergraduate math? Isn’t that just making you look bad? Very silly indeed.
You may be excused if it’s been 20 years since you graduated college and you’re not doing math or algorithms every day.
Ok, but then what are you doing? Why did they hire you in the first place?
Again, saying all this stuff just makes YOU look bad.
All that stuff you learned in college went out the window?
You paid thousands of dollars and learned nothing useful whatsoever?
Well maybe you got a job that only requires doing database queries and building report dashboards all day.
You got comfortable.
No problem with that. It pays well. It’s steady. You don’t have to bang your head against the wall all the time.
But now what? What are you going to do about it?
Are you going to blame other people for your situation?
If I want to go back to being a professional tennis player after a few years off the courts due to injury, is it my job to train myself back to a world-class level, or is it my opponents’ job to go easy on me?
Coding interviews at Google, Facebook, Amazon, etc. are great examples of why, as a professional, you can’t simply forget everything you learned in college.
Not knowing “college material” doesn’t make you “not a college student”, it makes you a bad professional.
i.e. You’re supposed to know this stuff, and yet you don’t.
You can’t say “I’m not a college student”, because that doesn’t excuse you from not knowing this stuff anyway.
What about people who didn’t go to college? There are tons of stories out there about people who have worked their way up from scratch on their own. They learned what they had to in order to pass the coding interview.
They did not say, “since I am not a college student, the company must change their standards when they hire me”.
They did all the same work as a college student, and in fact, it is even more admirable that they taught themselves!
So as an ML Wannabe, realize that saying “I’m not a college student” is not an excuse, it’s merely equivalent to saying, “I’m not a good professional, I’m a professional that lacks knowledge”.
There is a funny contradiction when it comes to coding interviews:
The wannabes always say, “industry doesn’t care about academics”.
Then when it comes to these coding interviews that they have trouble with, they complain: “industry cares too much about academics!”
So which is it?
That is why professionals at these tech companies do such great work: they are not just professionals. They are professionals who apply what they learned in college on a daily basis.
More on that next.
With hindsight, we can marvel at how Google became the giant it is today.
It all started with a simple Markov model (the kind of math ML Wannabes try to avoid).
This Markov model was a model of links to webpages on the Internet, and this allowed the founders of Google to create the most powerful search engine the world had ever seen.
Of course there were engineering challenges as well. How do you find the leading eigenvector of a matrix with over a million rows and columns?
“Aha!” You say. The Lazy Programmer is wrong. Clearly this is a practical engineering problem and not an academic one!
Sorry to say, but you are still wrong.
What library do you use to factor a million x million matrix? Oh right, one doesn’t exist.
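To make this concrete, here’s a toy sketch of the idea (my own made-up 3-page “web”, not Google’s actual code): power iteration finds the dominant eigenvector of the link matrix without ever factoring it, which is exactly why it scales.

```python
# Toy PageRank by power iteration: repeatedly push rank along the links.
# No matrix factorization needed -- that's why it scales to the web.
links = {0: [1, 2], 1: [2], 2: [0]}  # page -> pages it links to
n, damping = 3, 0.85
rank = [1.0 / n] * n
for _ in range(100):
    new = [(1 - damping) / n] * n
    for page, outs in links.items():
        for out in outs:
            new[out] += damping * rank[page] / len(outs)
    rank = new
print(rank)  # stationary distribution: page 2 ends up most "important"
```

On the real web the “matrix” has billions of rows, but each iteration only touches the links that actually exist – the same principle, at scale.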
Google didn’t just pioneer the mathematics of search, they also pioneered the field of big data which is a subset of distributed computing.
Why didn’t they just use the MapReduce library? Oh right, because they had to invent it first!
If you don’t believe that this is an “academic” subject, you can read the many papers Google has put out on its file system, global databases, etc.
I talk a lot about math in this article but another thing ML Wannabes really hate is programming things on their own. (They prefer to use libraries that involve just a few lines of code to get the job done quickly.)
So what happens when the library you want doesn’t exist?
Do you say screw it and move on to something else?
Well that’s what differentiates the leaders and the followers.
And surely one must ask oneself: is creating a billion-dollar company practical?
Whether you agree with these hiring practices or not, one can’t dispute results.
You might say: “Everyone around me is a professional, and they would all disagree with YOU, Lazy Programmer!”
You live in a bubble. Everyone does.
It makes sense if you think about it.
Your company hired people of similar aptitude to be on your team to get a particular job done. You are surrounded by like-minded people.
Of course you are.
And if anyone disagreed with you, you probably wouldn’t be friends with them anyway.
The likelihood of you being surrounded by opposing viewpoints is small.
How would your team get anything done if you could never agree?
But you can’t make the assumption that whatever is in your immediate radius applies uniformly throughout the rest of the world.
There are tens of thousands of STEM undergraduates going into the workforce each year.
Do you think they just automatically forget their undergraduate training?
Is the past 4 years simply erased from their minds?
No – instead, they become these coveted professional-college student hybrids you so fear.
I get it.
At some point, you want to stop thinking so hard.
You want to have a family.
You want to start taking up other hobbies that do not involve being a geek.
I can’t say that won’t be me someday.
In that case, go for that cushy job where it’s a little easier and you get to use all the libraries you want and never have to think about calculus and graph algorithms.
There’s nothing wrong with that.
A comfortable life, a comfortable software developer salary…
This is an excellent goal to have in life.
But you can’t have your cake and eat it too.
If you want to be a real professional (and not a wannabe) then you have to put in the work.
I’m just a vessel for information. I take machine learning and bring it to you.
I did not invent machine learning. So if there’s math, there’s math because the guy who invented it used that math.
Don’t blame me.
If you want to do machine learning, then accept what machine learning is.
You can’t choose to do machine learning, and then refuse to do all the work that everyone else did.
What makes you feel so self-entitled that you think everyone else has to do it except you?
At the very least, you should be interested and enthusiastic about gaining new knowledge – not actively trying to avoid it.
Whether you learn “top-down” or “bottom-up”, you’re going to have to answer the hard questions sooner or later.
If you choose this path, then it’s your job to make the journey.
Don’t expect others to carry you.
March 3, 2020
Hello deep learning and AI enthusiasts!
As we all know, the near future is somewhat uncertain. With an invisible virus spreading around the world at an alarming rate, some experts have suggested that it may reach a significant portion of the population.
Schools may close, you may be ordered to work from home, or you may want to avoid going outside altogether. This is not fiction – it’s already happening.
There will be little warning, and as students of science and technology, we should know how rapidly things can change when we have exponential growth (just look at AI itself).
Have you decided how you will spend your time?
I find moments of quiet self-isolation to be excellent for learning advanced or difficult concepts – particularly those in machine learning and artificial intelligence.
To that end, I’ll be releasing several coupons today – hopefully that helps you out and you’re able to study along with me.
Despite the fact that I just released a huge course on Tensorflow 2, this course is more relevant than ever. You might take a course that uses batch norm, adam optimization, dropout, batch gradient descent, etc. without any clue how they work. Perhaps, like me, you find doing “batch norm in 1 line of code” to be unsatisfactory. What’s really going on?
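To illustrate, here is roughly what that one line of batch norm computes under the hood – a NumPy sketch of my own for the training-time forward pass (the running statistics used at inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Standardize each feature over the batch, then apply the learned
    # scale (gamma) and shift (beta).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

np.random.seed(0)
x = np.random.randn(64, 10) * 5 + 3          # batch of 64, 10 features
out = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(out.mean(axis=0))  # close to 0
print(out.std(axis=0))   # close to 1
```

Understanding this is what lets you answer “what happens at test time when there is no batch?” – the kind of question the 1-line version hides from you.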
And yes, although it was originally designed for Tensorflow 1 and Theano, everything has been done in Tensorflow 2 as well (you’ll see what I mean).
Learn about awesome algorithms such as A2C, DDPG, and Evolution Strategies (ES). This course continues where my first Deep Reinforcement Learning course left off and is the third course in my Reinforcement Learning series.
A lot of people think SVMs are obsolete. Wrong! A lot of you students want a nice “plug-and-play” model that works well out of the box. Guess what one of the best models is for that? SVM!
Many of the concepts from SVMs are extremely useful today – like quadratic programming (used for portfolio optimization) and constrained optimization.
Constrained optimization appears in modern Reinforcement Learning, for you non-believers (see: TRPO, PPO).
Well, I don’t need to tell you how popular GANs are. They sparked a mini-revolution in deep learning with the ability to generate photo-realistic images, create music, and enhance low-resolution photos.
Variational autoencoders are a great (but often forgotten by those beginner courses) tool for understanding and generating data (much like GANs) from a principled, probabilistic viewpoint.
Ever seen those cool illustrations where they can change a picture of a person from smiling to frowning on a continuum? That’s VAEs in action!
This is one of my favorite courses. Every beginner ML course these days teaches you how to plug into scikit-learn.
This is trivial. Everyone can do this. Nobody will give you a job just because you can write 3 lines of code when there are 1000s of others lining up beside you who know just as much.
It’s so trivial I teach it for FREE.
That’s why, in this course (a real ML course), I teach you how to not just use, but implement each of the algorithms (the fundamental supervised models).
At the same time, I haven’t forgotten about the “practical” aspect of ML, so I also teach you how to build a web API to serve your trained model.
This is the eventual place where many of your machine learning models will end up. What? Did you think you would just write a script that prints your accuracy and then call it a day? Who’s going to use your model?
The answer is, you’re probably going to serve it (over a server, duh) using a web server framework, such as Django, Flask, Tornado, etc.
Never written your own backend web server application before? I’ll show you how.
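As a sketch of the idea, here’s a minimal prediction endpoint using only Python’s standard library (in practice you’d likely use Flask or Django; the sum-of-features “model” is a stand-in for your trained model):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in for your real model's predict(): just sums the features.
    return {"prediction": sum(features)}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        features = json.loads(self.rfile.read(length))["features"]
        body = json.dumps(predict(features)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To actually serve it (blocks forever, hence commented out):
# HTTPServer(("", 8000), PredictHandler).serve_forever()
```

A client would then POST JSON like {"features": [1.0, 2.0, 3.0]} to the server and get back {"prediction": 6.0} – the same request/response shape you’d use with Flask or Django.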
Alright, that’s all from me. Stay safe out there folks!
Note: these coupons will last 31 days – don’t wait!
January 5, 2020
See the corresponding YouTube video lecture here: https://youtu.be/3r5eNV7WZ6g
In this article, I will teach you how to setup your NVIDIA GPU laptop (or desktop!) for deep learning with NVIDIA’s CUDA and CuDNN libraries.
The main thing to remember before we start is that these steps are constantly in flux – things change, and they change quickly in the field of deep learning. Therefore I remind you of my slogan: “Learn the principles, not the syntax”. We are not doing any coding here, so there’s no “syntax” per se, but the general idea is to learn the principles at a high level; don’t try to memorize details, which may change on you and confuse you if you forget what the principles are.
This article is more like a personal story rather than a strict tutorial. It’s meant to help you understand the many obstacles you may encounter along the way, and what practical strategies you can take to get around them.
There are about 10 different ways to install the things we need. Some will work; some won’t. That’s just how cutting-edge software is. If that makes you uncomfortable, well, stop being a baby. Yes, it’s going to be frustrating. No, I didn’t invent this stuff, it is not within my control. Learn the principles, not the syntax!
This article will be organized into the following sections:
If you’ve never setup your laptop for GPU-enabled deep learning before, then you might assume that there’s nothing you need to do beyond buying a laptop with a GPU. WRONG!
You need to have a specific kind of laptop with specific software and drivers installed. Everything must work together.
You can think of all the software on your computer as a “stack” of layers.
At the lowest layer, you have the kernel (very low-level software that interacts with the hardware) and at higher levels you have runtimes and libraries such as SQLite, SSL, etc.
When you write an application, you need to make use of lower-level runtimes and libraries – your code doesn’t just run all by itself.
So, when you install Tensorflow (as an example), that depends on lower-level libraries (such as CUDA and CuDNN) which interact with the GPU (hardware).
If any of the layers in your stack are missing (all the way from the hardware up to high-level libraries), your code will not work.
Low-Level = Hardware
High-Level = Libraries and Frameworks
Not all GPUs are created equal. If you buy a MacBook Pro these days, you’ll get a Radeon Pro Vega GPU. If you buy a Dell laptop, it might come with an Intel UHD GPU.
These are no good for machine learning or deep learning.
You will need a laptop with an NVIDIA GPU.
Some laptops come with a “mobile” NVIDIA GPU, such as the GTX 950m. These are OK, but ideally you want a GPU that doesn’t end with “m”. As always, check performance benchmarks if you want the full story.
I would also recommend at least 4GB of GPU RAM (otherwise, you won’t be able to use larger batch sizes, which will affect training).
In fact, some of the newer neural networks won’t even fit in GPU memory for prediction, never mind training!
One thing you have to consider is if you actually want to do deep learning on your laptop vs. just provisioning a GPU-enabled machine on a service such as AWS (Amazon Web Services).
These will cost you a few cents to a dollar per hour (depending on the machine type), so if you just have a one-off job to run, you may want to consider this option.
I already have a walkthrough tutorial in my course Modern Deep Learning in Python about that, so I assume if you are reading this article, you are rather interested in purchasing your own GPU-enabled computer and installing everything yourself.
Personally, I would recommend Lenovo laptops. The main reason is they always play nice with Linux (we’ll go over why that’s important in the next section). Lenovo is known for their high-quality and sturdy laptops and most professionals who use PCs for work use Thinkpads. They have a long history (decades) of serving the professional community so it’s nearly impossible to go wrong. Other brands generally have lots of issues (e.g. sound not working, WiFi not working, etc.) with Linux.
Here are some good laptops with NVIDIA GPUs:
Lenovo Ideapad L340 Gaming Laptop, 15.6 Inch FHD (1920 X 1080) IPS Display, Intel Core i5-9300H Processor, 8GB DDR4 RAM, 512GB Nvme SSD, NVIDIA GeForce GTX 1650, Windows 10, 81LK00HDUS, Black ($694.95)
This one only has an i5 processor and 8GB of RAM, but on the plus side it’s cost-effective. Note that the prices were taken when I wrote this article; they might change.
Same as above but different specs. 16GB RAM with an i7 processor, but only 256GB of SSD space. Same GPU. So there are some tradeoffs to be made.
2019 Lenovo Legion Y540 15.6″ FHD Gaming Laptop Computer, 9th Gen Intel Hexa-Core i7-9750H Up to 4.5GHz, 24GB DDR4 RAM, 1TB HDD + 512GB PCIE SSD, GeForce GTX 1650 4GB, 802.11ac WiFi, Windows 10 Home ($998.00)
This is the best option in my opinion. Better or equal specs compared to the previous two. i7 processor, 24GB of RAM (32GB would be ideal!), lots of space (1TB HD + 512GB SSD), and the same GPU. Bonus: it’s nearly the same price as the above (currently).
Pricier, but great specs. Same GPU!
Lenovo ThinkPad P53 Mobile Workstation 20QN0018US – Intel Six Core i7-9850H, 16GB RAM, 512GB PCIe Nvme SSD, 15.6″ HDR 400 FHD IPS 500Nits Display, NVIDIA Quadro RTX 5000 16GB GDDR6, Windows 10 Pro ($3,472.69)
If you really want to splurge, consider one of these big boys. Thinkpads are classic professional laptops. These come with real beast GPUs – NVIDIA Quadro RTX 5000 with 16GB of VRAM.
You’ve still got the i7 processor, 16GB of RAM, and a 512GB NVMe SSD (basically a faster version of already-super-fast SSDs). Personally, I think if you’re going to splurge, you should opt for 32GB of RAM and a 1TB SSD.
If you’ve watched my videos, you might be wondering: what about a Mac? (I use a Mac for screen recording).
Macs are great in general for development, and they used to come with NVIDIA GPUs (although those GPUs are not as powerful as the ones currently available for PCs). Support for Mac has dropped off in the past few years, so you won’t be able to install say, the latest version of Tensorflow, CUDA, and CuDNN without a significant amount of effort (I spent probably a day and just gave up). And on top of that the GPU won’t even be that great. Overall, not recommended.
As I mentioned earlier, you probably want to be running Linux (Ubuntu is my favorite).
Why, you might ask?
“Tensorflow works on Windows, so what’s the problem?”
Remember my motto: “Learn the principles, not the syntax”.
What’s the principle here? Many of you probably haven’t been around long enough to know this, but the problem is, many machine learning and deep learning libraries didn’t work with Windows when they first came out.
So, unless you want to wait a year or more after new inventions and software are being made, then try to avoid Windows.
Don’t take my word for it, look at the examples:
There are more examples, but these are the major historical “lessons” I point to for why I normally choose Linux over Windows.
One benefit of using Windows is that installing CUDA is very easy, and it’s very likely that your Windows OS (on your Lenovo laptop) will come with it pre-installed. The original use-case for GPUs was gaming, so it’s pretty user-friendly.
If you purchase one of the above laptops and you choose to stick with Windows, then you will not have to worry about installing CUDA – it’s already there. There is a nice user interface so whenever you need to update the CUDA drivers you can do so with just a few clicks.
Installing CuDNN is less trivial, but the instructions are pretty clear (https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#installwindows). Simply download the zip file, unzip it, copy the files to the locations specified in the instructions, and set a few environment variables. Easy!
TO BE CLEAR:
Aside from the Python libraries below (such as Tensorflow / PyTorch), you need to install 2 things from NVIDIA first: CUDA and CuDNN.
I always find it useful to have both Windows and Ubuntu on-hand, and if you get the laptop above that has 2 drives (1TB HD and 512GB SSD) dual-booting is a natural choice.
These days, dual booting is not too difficult. Usually, one starts with Windows. Then, you insert your Ubuntu installer (USB stick), and choose the option to install Ubuntu alongside the existing OS. There are many tutorials online you can follow.
Hint: Upon entering the BIOS, you may have to disable the Secure Boot / Fast Boot options.
I already have lectures on how to install Python with and without Anaconda. These days, Anaconda works well on Linux, Mac, and Windows, so I recommend it for easy management of your virtual environments.
Ok, now we get to the hard stuff. You have your laptop and your Ubuntu/Debian OS.
Can you just install Tensorflow and magically start making use of your super powerful GPU? NO!
Now you need to install the “low-level” software that Tensorflow/Theano/PyTorch/etc. make use of – which are CUDA and CuDNN.
This is where things get tricky, because there are many ways to install CUDA and CuDNN, and some of these ways don’t always work (from my experience).
Examples of how things can “randomly go wrong”:
Here is a method that consistently works for me:
Those instructions are subject to change, but basically you can just copy and paste what they give you (don’t copy the below, check the site to get the latest version):
sudo dpkg -i http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt-get update && sudo apt-get install libcudnn7 libcudnn7-dev
If you decided you hate reinforcement learning and you’re okay with not being able to use new software until it becomes mainstream, then you may have decided you want to stick with Windows.
Luckily, there’s still lots you can do in deep learning.
As mentioned previously, installing CUDA and CuDNN on Windows is easy.
If you did not get a laptop which has CUDA preinstalled, then you’ll have to install it yourself. Go to https://developer.nvidia.com/cuda-downloads, choose the options appropriate for your system (Windows 10 / x86_64 (64-bit) / etc.)
This will give you a .exe file to download. Simply click on it and follow the onscreen prompts.
As mentioned earlier, installing CuDNN is a little more complicated, but not too troublesome. Just go to https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#installwindows and follow NVIDIA’s instructions for where to put the files and what environment variables to set.
Unlike the other libraries we’ll discuss, Tensorflow has separate packages for the CPU and GPU versions.
The Tensorflow website will give you the exact command to run to install Tensorflow (it’s the same whether you are in Anaconda or not).
So you would install it using either:
pip install tensorflow
pip install tensorflow-gpu
Since this article is about GPU-enabled deep learning, you’ll want to install tensorflow-gpu.
UPDATE: Starting with version 2.1, installing “tensorflow” will automatically give you GPU capabilities, so there’s no need to install a GPU-specific package (although installing “tensorflow-gpu” still works).
After installing Tensorflow, you can verify that it is using the GPU:
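The check itself is a one-liner (assuming Tensorflow 2.0/2.1; in later versions tf.test.is_gpu_available is deprecated in favor of tf.config.list_physical_devices):

```python
import tensorflow as tf

# True if Tensorflow can see a CUDA-capable GPU
print(tf.test.is_gpu_available())

# Equivalent check on newer versions:
# print(len(tf.config.list_physical_devices('GPU')) > 0)
```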
This will return True if Tensorflow is using the GPU.
Nothing special nowadays! Just do:
pip install torch
To check whether PyTorch is using the GPU, you can use the following commands:
In : import torch
In : torch.cuda.current_device()
Out: 0
In : torch.cuda.device(0)
Out: <torch.cuda.device at 0x7efce0b03be0>
In : torch.cuda.device_count()
Out: 1
In : torch.cuda.get_device_name(0)
Out: 'GeForce GTX 950M'
In : torch.cuda.is_available()
Out: True
Luckily, Keras is just a wrapper around other libraries such as Tensorflow and Theano. Therefore, there is nothing special you have to do, as long as you already have the GPU-enabled version of the base library.
Therefore, just install Keras as you normally would:
pip install keras
As long as Keras is using Tensorflow as a backend, you can use the same method as above to check whether or not the GPU is being used.
For both Ubuntu and Windows, as always I recommend using Anaconda. In this case, the command to install Theano with GPU support is simply:
conda install theano pygpu
If necessary, further details can be found at:
SIDE NOTE: Unfortunately, I will not provide technical support for your environment setup. You are welcome to schedule a 1-on-1 but availability is limited.