Lazy Programmer

Your source for the latest in deep learning, big data, data science, and artificial intelligence. Sign up now

New course! Reinforcement Learning in Python

January 27, 2017


I would like to announce my latest course – Artificial Intelligence: Reinforcement Learning in Python.

This has been one of my most requested topics since I started covering deep learning. This course has been brewing in the background for months.

The result: This is my most MASSIVE course yet.

Usually, my courses will introduce you to a handful of new algorithms (which is a lot for people to handle already). This course covers SEVENTEEN (17!) new algorithms.

This will keep you busy for a LONG time.

If you’re used to supervised and unsupervised machine learning, realize this: Reinforcement Learning is a whole new ball game.

There are so many new concepts to learn, and so much depth. It’s COMPLETELY different from anything you’ve seen before.

That’s why we build everything slowly, from the ground up.

There’s tons of new theory, but as you’ve come to expect, anytime we introduce new theory it is accompanied by full code examples.

What is Reinforcement Learning? It’s the technology behind self-driving cars, AlphaGo, video game-playing programs, and more.

You’ll learn that while deep learning has been very useful for tasks like driving and playing Go, it’s in fact just a small part of the picture.

Reinforcement Learning provides the framework that allows deep learning to be useful.

Without reinforcement learning, all we have is a basic (albeit very accurate) labeling machine.

With Reinforcement Learning, you have intelligence.

Reinforcement Learning has even been used to model processes in psychology and neuroscience. It’s truly the closest thing we have to “machine intelligence” and “general AI”.

What are you waiting for? Sign up now!!


#artificial intelligence #deep learning #reinforcement learning

Go to comments

New Years Udemy Coupons! All Udemy Courses only $10

January 1, 2017

Act fast! These $10 Udemy Coupons expire in 10 days.

Ensemble Machine Learning: Random Forest and AdaBoost

Deep Learning Prerequisites: Linear Regression in Python

Deep Learning Prerequisites: Logistic Regression in Python

Deep Learning in Python

Practical Deep Learning in Theano and TensorFlow

Deep Learning: Convolutional Neural Networks in Python

Unsupervised Deep Learning in Python

Deep Learning: Recurrent Neural Networks in Python

Advanced Natural Language Processing: Deep Learning in Python

Easy Natural Language Processing in Python

Cluster Analysis and Unsupervised Machine Learning in Python

Unsupervised Machine Learning: Hidden Markov Models in Python

Data Science: Supervised Machine Learning in Python

Bayesian Machine Learning in Python: A/B Testing

SQL for Newbs and Marketers

How to get ANY course on Udemy for $10 (please use my coupons above for my courses):

Click here for a link to all courses on the site:

Click here for a great calculus prerequisite course:

Click here for a great Python prerequisite course:

Click here for a great linear algebra 1 prerequisite course:

Click here for a great linear algebra 2 prerequisite course:

Go to comments

New course! Ensemble Machine Learning in Python: Random Forest and AdaBoost

December 25, 2016


[Skip to the bottom if you just want the coupon]

This course is all about ensemble methods.

We’ve already learned some classic machine learning models like k-nearest neighbor and decision tree. We’ve studied their limitations and drawbacks.

But what if we could combine these models to eliminate those limitations and produce a much more powerful classifier or regressor?

In this course you’ll study ways to combine models like decision trees and logistic regression to build models that can reach much higher accuracies than the base models they are made of.

In particular, we will study the Random Forest and AdaBoost algorithms in detail.

To motivate our discussion, we will learn about an important topic in statistical learning, the bias-variance trade-off. We will then study the bootstrap technique and bagging as methods for reducing both bias and variance simultaneously.

We’ll do plenty of experiments and use these algorithms on real datasets so you can see first-hand how powerful they are.

Since deep learning is so popular these days, we will study some interesting commonalities between random forests, AdaBoost, and deep learning neural networks.

Go to comments

Announcing Data Science: Supervised Machine Learning in Python (Less Math, More Action!)

September 16, 2016


If you don’t want to read about the course and just want the 88% OFF coupon code, skip to the bottom.

In recent years, we’ve seen a resurgence in AI, or artificial intelligence, and machine learning.

Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on-par with human experts.

Google’s AlphaGo program was able to beat a world champion in the strategy game go using deep reinforcement learning.

Machine learning is even being used to program self driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.

Google famously announced that they are now “machine learning first”, meaning that machine learning is going to get a lot more attention now, and this is what’s going to drive innovation in the coming years.

Machine learning is used in many industries, like finance, online advertising, medicine, and robotics.

It is a widely applicable tool that will benefit you no matter what industry you’re in, and it will also open up a ton of career opportunities once you get good.

Machine learning also raises some philosophical questions. Are we building a machine that can think? What does it mean to be conscious? Will computers one day take over the world?

The best part about this course is that it requires WAY less math than my usual courses; just some basic probability and geometry, no calculus!

In this course, we are first going to discuss the K-Nearest Neighbor algorithm. It’s extremely simple and intuitive, and it’s a great first classification algorithm to learn. After we discuss the concepts and implement it in code, we’ll look at some ways in which KNN can fail.

It’s important to know both the advantages and disadvantages of each algorithm we look at.

Next we’ll look at the Naive Bayes Classifier and the General Bayes Classifier. This is a very interesting algorithm to look at because it is grounded in probability.

We’ll see how we can transform the Bayes Classifier into a linear and quadratic classifier to speed up our calculations.

Next we’ll look at the famous Decision Tree algorithm. This is the most complex of the algorithms we’ll study, and most courses you’ll look at won’t implement them. We will, since I believe implementation is good practice.

The last algorithm we’ll look at is the Perceptron algorithm. Perceptrons are the ancestor of neural networks and deep learning, so they are important to study in the context of machine learning.

One we’ve studied these algorithms, we’ll move to more practical machine learning topics. Hyperparameters, cross-validation, feature extraction, feature selection, and multiclass classification.

We’ll do a comparison with deep learning so you understand the pros and cons of each approach.

We’ll discuss the Sci-Kit Learn library, because even though implementing your own algorithms is fun and educational, you should use optimized and well-tested code in your actual work.

We’ll cap things off with a very practical, real-world example by writing a web service that runs a machine learning model and makes predictions. This is something that real companies do and make money from.

All the materials for this course are FREE. You can download and install Python, Numpy, and Scipy with simple commands on Windows, Linux, or Mac.

UPDATE: New coupon if the above is sold out:

#data science #machine learning #matplotlib #numpy #pandas #python

Go to comments

New course – Natural Language Processing: Deep Learning in Python part 6

August 9, 2016


[Scroll to the bottom for the early bird discount if you already know what this course is about]

In this course we are going to look at advanced NLP using deep learning.

Previously, you learned about some of the basics, like how many NLP problems are just regular machine learning and data science problems in disguise, and simple, practical methods like bag-of-words and term-document matrices.

These allowed us to do some pretty cool things, like detect spam emails, write poetry, spin articles, and group together similar words.

In this course I’m going to show you how to do even more awesome things. We’ll learn not just 1, but 4 new architectures in this course.

First up is word2vec.

In this course, I’m going to show you exactly how word2vec works, from theory to implementation, and you’ll see that it’s merely the application of skills you already know.

Word2vec is interesting because it magically maps words to a vector space where you can find analogies, like:

  • king – man = queen – woman
  • France – Paris = England – London
  • December – Novemeber = July – June

We are also going to look at the GLoVe method, which also finds word vectors, but uses a technique called matrix factorization, which is a popular algorithm for recommender systems.

Amazingly, the word vectors produced by GLoVe are just as good as the ones produced by word2vec, and it’s way easier to train.

We will also look at some classical NLP problems, like parts-of-speech tagging and named entity recognition, and use recurrent neural networks to solve them. You’ll see that just about any problem can be solved using neural networks, but you’ll also learn the dangers of having too much complexity.

Lastly, you’ll learn about recursive neural networks, which finally help us solve the problem of negation in sentiment analysis. Recursive neural networks exploit the fact that sentences have a tree structure, and we can finally get away from naively using bag-of-words.

All of the materials required for this course can be downloaded and installed for FREE. We will do most of our work in Numpy and Matplotlib,and Theano. I am always available to answer your questions and help you along your data science journey.

See you in class!

UPDATE: New coupon if the above is sold out:

#deep learning #GLoVe #natural language processing #nlp #python #recursive neural networks #tensorflow #theano #word2vec

Go to comments

New course – Deep Learning part 5: Recurrent Neural Networks in Python

July 14, 2016


New course out today – Recurrent Neural Networks in Python: Deep Learning part 5.

If you already know what the course is about (recurrent units, GRU, LSTM), grab your 50% OFF coupon and go!:

Like the course I just released on Hidden Markov Models, Recurrent Neural Networks are all about learning sequences – but whereas Markov Models are limited by the Markov assumption, Recurrent Neural Networks are not – and as a result, they are more expressive, and more powerful than anything we’ve seen on tasks that we haven’t made progress on in decades.

Sequences appear everywhere – stock prices, language, credit scoring, and webpage visits.

Recurrent neural networks have a history of being very hard to train. It hasn’t been until recently that we’ve found ways around what is called the vanishing gradient problem, and since then, recurrent neural networks have become one of the most popular methods in deep learning.

If you took my course on Hidden Markov Models, we are going to go through a lot of the same examples in this class, except that our results are going to be a lot better.

Our classification accuracies will increase, and we’ll be able to create vectors of words, or word embeddings, that allow us to visualize how words are related on a graph.

We’ll see some pretty interesting results, like that our neural network seems to have learned that all religions and languages and numbers are related, and that cities and countries have hierarchical relationships.

If you’re interested in discovering how modern deep learning has propelled machine learning and data science to new heights, this course is for you.

I’ll see you in class.

Click here for 50% OFF:

#data science #deep learning #gru #lstm #machine learning #word vectors

Go to comments

New course: Unsupervised Deep Learning in Python

May 15, 2016


This course is the next logical step in my deep learning, data science, and machine learning series. I’ve done a lot of courses about deep learning, and I just released a course about unsupervised learning, where I talked about clustering and density estimation. So what do you get when you put these 2 together? Unsupervised deep learning!

In these course we’ll start with some very basic stuff – principal components analysis (PCA), and a popular nonlinear dimensionality reduction technique known as t-SNE (t-distributed stochastic neighbor embedding).

Next, we’ll look at a special type of unsupervised neural network called the autoencoder. After describing how an autoencoder works, I’ll show you how you can link a bunch of them together to form a deep stack of autoencoders, that leads to better performance of a supervised deep neural network. Autoencoders are like a non-linear form of PCA.

Last, we’ll look at restricted Boltzmann machines (RBMs). These are yet another popular unsupervised neural network, that you can use in the same way as autoencoders to pretrain your supervised deep neural network. I’ll show you an interesting way of training restricted Boltzmann machines, known as Gibbs sampling, a special case of Markov Chain Monte Carlo, and I’ll demonstrate how even though this method is only a rough approximation, it still ends up reducing other cost functions, such as the one used for autoencoders. This method is also known as Contrastive Divergence or CD-k. As in physical systems, we define a concept called free energy and attempt to minimize this quantity.

Finally, we’ll bring all these concepts together and I’ll show you visually what happens when you use PCA and t-SNE on the features that the autoencoders and RBMs have learned, and we’ll see that even without labels the results suggest that a pattern has been found.

All the materials used in this course are FREE. Since this course is the 4th in the deep learning series, I will assume you already know calculus, linear algebra, and Python coding. You’ll want to install Numpy andTheano for this course. These are essential items in your data analytics toolbox.

If you are interested in deep learning and you want to learn about modern deep learning developments beyond just plain backpropagation, including using unsupervised neural networks to interpret what features can be automatically and hierarchically learned in a deep learning system, this course is for you.

Get your EARLY BIRD coupon for 50% off here:

Go to comments

New Deep Learning course on Udemy

February 26, 2016


This course continues where my first course, Deep Learning in Python, left off. You already know how to build an artificial neural network in Python, and you have a plug-and-play script that you can use for TensorFlow.

You learned about backpropagation (and because of that, this course contains basically NO MATH), but there were a lot of unanswered questions. How can you modify it to improve training speed? In this course you will learn about batch and stochastic gradient descent, two commonly used techniques that allow you to train on just a small sample of the data at each iteration, greatly speeding up training time.

You will also learn about momentum, which can be helpful for carrying you through local minima and prevent you from having to be too conservative with your learning rate. You will also learn aboutadaptive learning rate techniques like AdaGrad and RMSprop which can also help speed up your training.

In my last course, I just wanted to give you a little sneak peak at TensorFlow. In this course we are going to start from the basics so you understand exactly what’s going on – what are TensorFlow variables and expressions and how can you use these building blocks to create a neural network? We are also going to look at a library that’s been around much longer and is very popular for deep learning – Theano. With this library we will also examine the basic building blocks – variables, expressions, and functions – so that you can build neural networks in Theano with confidence.

Because one of the main advantages of TensorFlow and Theano is the ability to use the GPU to speed up training, I will show you how to set up a GPU-instance on AWS and compare the speed of CPU vs GPU for training a deep neural network.

With all this extra speed, we are going to look at a real dataset – the famous MNIST dataset (images of handwritten digits) and compare against various known benchmarks.

#adagrad #aws #batch gradient descent #deep learning #ec2 #gpu #machine learning #nesterov momentum #numpy #nvidia #python #rmsprop #stochastic gradient descent #tensorflow #theano

Go to comments

A Tutorial on Autoencoders for Deep Learning

December 31, 2015

Despite its somewhat initially-sounding cryptic name, autoencoders are a fairly basic machine learning model (and the name is not cryptic at all when you know what it does).

Autoencoders belong to the neural network family, but they are also closely related to PCA (principal components analysis).

Some facts about the autoencoder:

  • It is an unsupervised learning algorithm (like PCA)
  • It minimizes the same objective function as PCA
  • It is a neural network
  • The neural network’s target output is its input

The last point is key here. This is the architecture of an autoencoder:


So the dimensionality of the input is the same as the dimensionality of the output, and essentially what we want is x’ = x.

It can be shown that the objective function for PCA is:

$$ J = \sum_{n=1}^{N} |x(n) – \hat{x}(n)|^2 $$

Where the prediction \( \hat{x}(n) = Q^{-1}Qx(n) \).

Q can be the full transformation matrix (which would result in getting exactly the old x back), or it can be a “rank k” matrix (i.e. keeping the k-most relevant eigenvectors), which would then result in only an approximation of x.

So the objective function can be written as:

$$ J = \sum_{n=1}^{N} |x(n) – Q^{-1}Qx(n)|^2 $$

Now let’s return to autoencoders.

Recall that to get the value at the hidden layer, we simply multiply the input->hidden weights by the input.

Like so:

$$ z = f(Wx) $$

And to get the value at the output, we multiply the hidden->output weights by the hidden layer values, like so:

$$ y = g(Vz) $$

The choice of \( f \) and \( g \) is up to us, we just have to know how to take the derivative for backpropagation.

We are of course free to make them “identity” functions, such that:

$$ y = g(V f(Wx)) = VWx $$

This gives us the objective:

$$ J = \sum_{n=1}^{N} |x(n) – VWx(n)|^2 $$

Which is the same as PCA!


If autoencoders are similar to PCA, why do we need autoencoders?

Autoencoders are much more flexible than PCA.

Recall that with neural networks we have an activation function – this can be a “ReLU” (aka. rectifier), “tanh” (hyperbolic tangent), or sigmoid.

This introduces nonlinearities in our encoding, whereas PCA can only represent linear transformations.

The network representation also means you can stack autoencoders to form a deep network.


Cool theory bro, but what can autoencoders actually do for me?

Good question!

Similar to PCA – autoencoders can be used for finding a low-dimensional representation of your input data. Why is this useful?

Some of your features may be redundant or correlated, resulting in wasted processing time and overfitting in your model (too many parameters).

It is thus ideal to only include the features we need.

If your “reconstruction” of x is very accurate, that means your low-dimensional representation is good.

You can then use this transformation as input into another model.


Training an autoencoder

Since autoencoders are really just neural networks where the target output is the input, you actually don’t need any new code.

Suppose we’re working with a sci-kit learn-like interface.

Instead of:, Y)

You would just have:, X)

Pretty simple, huh?

All the usual neural network training strategies work with autoencoders too:

  • backpropagation
  • regularization
  • dropout
  • RBM pre-training

If you want to get good with autoencoders – I would recommend trying to take some data and an existing neural network package you’re comfortable with – and see what low-dimensional representation you can come up with. How many dimensions are there?


Where can I learn more?

Autoencoders are part of a family of unsupervised deep learning methods, which I cover in-depth in my course, Unsupervised Deep Learning in Python. We discuss how to stack autoencoders to build deep belief networks, and compare them to RBMs which can be used for the same purpose. We derive all the equations and write all the code from scratch – no shortcuts. Ask me for a coupon so I can give you a discount!

P.S. “Autoencoders” means “encodes itself”. Not so cryptic now, right?

Leave a comment!

#autoencoders #deep learning #machine learning #pca #principal components analysis #unsupervised learning

Go to comments

Logistic Regression in Python video course

November 11, 2015

Hi all!

Do you ever get tired of reading walls of text, and just want a nice video or 10 to explain to you the magic of logistic regression and how to program it with Python?

Look no further, that video course is here.

#big data #data science #logistic regression #neural networks #numpy #python

Go to comments