December 15, 2018
Very interesting paper that got the Best Paper award at NIPS 2018.
“Neural Ordinary Differential Equations” by Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud.
Comes out of Geoffrey Hinton’s Vector Institute in Toronto, Canada (although he is not an author on the paper).
For those of you who have ever programmed simulations of systems of differential equations, the motivation behind this should be quite intuitive.
Recall that a derivative is the same thing as the slope of a tangent line, and can be approximated by the usual “rise over run” formula for small time steps \( \Delta t \).
$$ \frac{dh}{dt} \approx \frac{h(t + \Delta t) - h(t)}{\Delta t} $$
Here’s a picture of that if you forgot what it looks like:
Normally, the derivative is given by some known function \( \frac{dh}{dt} = f(h, t) \).
Your job in writing a simulation is to find out how \( h(t) \) evolves over time.
Here’s a picture of how that works (using different symbols):
Since our job is to find the next value of \( h(t) \), we can rearrange the above to get:
$$ h(t + \Delta t) = h(t) + f(h(t), t) \Delta t $$
Typically the time step is just \( 1 \), so we can rewrite the above as:
$$ h_{t+1} = h_t + f(h_t, t) $$
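To make the simulation idea concrete, here is a minimal sketch of Euler's method in Python. The example ODE \( f(h, t) = -h \) and the step sizes are my own made-up choices for illustration, not from the paper:

```python
import math

# Euler's method: repeatedly apply h <- h + f(h, t) * dt.
# Example ODE: dh/dt = -h, whose exact solution is h(t) = h(0) * exp(-t).
def f(h, t):
    return -h

def euler(h0, t0, t1, dt):
    h, t = h0, t0
    while t < t1 - 1e-12:  # guard against floating-point round-off
        h = h + f(h, t) * dt
        t = t + dt
    return h

print(euler(1.0, 0.0, 1.0, 0.001))  # ≈ exp(-1) ≈ 0.368
```

Shrinking \( \Delta t \) makes the approximation better; setting \( \Delta t = 1 \) gives exactly the update \( h_{t+1} = h_t + f(h_t, t) \) above.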
Researchers noticed that this looks a lot like the residual network layer that is often used in deep learning!
In a residual network layer, \( h_t \) represents the input value, \( h_{t+1} \) represents the output value, and \( f(h_t, t) \) represents the residual.
Here’s a picture of that (using different symbols):
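In code, the parallel is direct. Here is a toy residual step in Python, where the residual function \( f \) is a hypothetical one-unit tanh transformation with made-up weights standing in for a real network:

```python
import math

# A toy residual layer: output = input + residual, i.e. h_{t+1} = h_t + f(h_t).
# The "network" f here is a hypothetical one-unit tanh with made-up weights.
def residual_f(h, w=0.5, b=0.1):
    return math.tanh(w * h + b)

def residual_step(h):
    return h + residual_f(h)  # identity skip connection plus residual

# Stacking residual layers = iterating the difference equation.
h = 1.0
for t in range(3):
    h = residual_step(h)
print(h)
```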
At this point, the question to ask is, if a residual network layer is just a difference equation that approximates a differential equation, can there be a neural network layer that is an actual differential equation?
How would backpropagation be done?
This paper goes over all that and more.
Read the paper here! https://arxiv.org/abs/1806.07366
May 21, 2018
This is a great video that explains a lot of what I’ve observed from students trying to learn machine learning, but put more eloquently than I could have said it myself. =)
I’m always having to contend with students who have taken a super easy-peasy course, actually learned nothing, but believe they know everything. Then, when they come up against the real content, they believe it’s because the instructor is trying to make the course really “elite” or trying to make them feel “dumb” by including lots of math and/or programming that they can’t understand.
But realize:
We didn’t put math in there just to torture you. If you’re taking a math course, it’s probably going to have math in it.
A student gets frustrated because they don’t understand the real subject, but really they should be frustrated with the instructor who gave them the empty course that provided them with no skill and too much confidence.
This video is about software developers, but if you view it from the perspective of machine learning, everything still applies. Watch the video!
May 1, 2018
Over the past year, many of you have been asking for a followup on my RNN and Deep NLP courses. I am glad to announce that today, that course is here.
Deep Learning: Advanced NLP and RNNs
I decided to combine both NLP (natural language processing) and RNNs (recurrent neural networks) because these topics are so intertwined it’s almost impossible to talk about one without the other.
In recent years, a few ideas have started to bubble up and have shown themselves to be truly useful, and in this course, I bring those ideas to you.
Let’s start with the applications:
1. I’ve been asked quite a few times about how to do classification when each input can have multiple labels assigned to it. We will do a text classification problem that has data exactly like this.
2. Neural machine translation. One of the most popular applications of Deep NLP. We can’t not do this.
3. Question answering. You can think of this as “reading comprehension”. Can an AI read a story and answer a question about it? Facebook Research made this popular with their bAbI dataset.
4. Speech recognition (see below).
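On the multi-label point: the key difference from ordinary classification is an independent sigmoid per label instead of one softmax over all labels, so any subset of labels can fire for a single input. A minimal sketch (the tag names and logit values are made up for illustration):

```python
import math

# Multi-label classification: each label gets its own sigmoid, so an input
# can receive several labels at once. Logits below are made-up scores for
# one input over four hypothetical tags.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logits = {"toxic": 2.1, "insult": 0.3, "threat": -1.7, "obscene": 1.2}
predicted = [tag for tag, z in logits.items() if sigmoid(z) > 0.5]
print(predicted)  # → ['toxic', 'insult', 'obscene']
```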
As you know I like to take an abstract view of machine learning. We know that all of the techniques for these applications can be used for yet more applications without any change in code because the “data is the same”. For example, a spam detection dataset looks no different than a sentiment analysis dataset.
In the same vein, neural machine translation is no different from simple versions of question answering and chatbots. So you are really learning how to do all of these things at the same time.
We will of course get a chance to review basics such as LSTMs, GRUs, language modeling, word embeddings, and so forth.
What techniques will we cover? These techniques are what have helped RNNs really work well for NLP in the recent past:
1. Bidirectional RNNs
2. Sequence-to-sequence models (seq2seq)
So, if you’ve already heard about these and you wanted to learn about them – I hope you are excited!
THERE’S MORE:
This course is NOT just about RNNs but CNNs (convolutional neural networks) as well. This is an advanced course – ALL deep learning is fair game.
Early in the course, you’ll see how we can apply CNNs to text.
You will see that we get results on par with LSTMs and GRUs.
That’s already pretty neat.
But there’s still more.
If you’re reading this, you automatically get access to the VIP version of the course, which contains EVEN MORE material.
For the first time, I’m releasing a course exclusively on https://deeplearningcourses.com
This course will appear on other sites in the future but you will NOT get the VIP version from those sites.
What’s in the VIP bonus?
It’s basically like an entirely new section of the course.
We will be looking at a topic I’ve wanted to cover for a long time: speech recognition.
Unlike the usual type of NLP stuff which focuses on text, speech recognition focuses on audio.
Text is neat and formatted. When you type the word “the” it’s the same as if I type the word “the”.
The same cannot be said for audio. When you say “the” it sounds different from when I say “the”.
Audio is a real-world, physical signal like images are.
In that sense, speech recognition is more like computer vision.
In fact, you’ll see how we can apply CNNs to this task as well.
I love this section of the course because we get to dive into some very cool, never-before-seen material in order to do speech processing – namely time-series techniques such as the Fourier transform.
You’ll even get a brief glimpse into how the Fourier transform is related to quantum mechanics and Heisenberg’s uncertainty principle!
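As a small taste of the Fourier transform side of things, here is a NumPy sketch (the two tone frequencies are arbitrary choices of mine, not from the course) showing how the FFT exposes the frequencies hidden in a raw audio-like signal:

```python
import numpy as np

# A signal made of two tones (50 Hz and 120 Hz). With a 1024 Hz sample
# rate and one second of audio, FFT bin k corresponds to exactly k Hz.
fs = 1024                      # sample rate (Hz)
t = np.arange(fs) / fs         # one second of samples
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

spectrum = np.abs(np.fft.rfft(x))
peaks = np.argsort(spectrum)[-2:]  # indices of the two largest bins
print(sorted(peaks.tolist()))      # → [50, 120]
```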
Enough talk. Get the course here:
Deep Learning: Advanced NLP and RNNs
https://deeplearningcourses.com/c/deeplearningadvancednlp
NOTES:
1. As usual, if you purchase the course on deeplearningcourses.com and you’d like access on Udemy as well, I will do that for you once the course is released there.
2. I’ve made a lot of updates to deeplearningcourses.com recently, so hopefully you find them useful! Always happy to consider feature requests.
3. I recently moved deeplearningcourses.com to a shiny new server, so if you have any problems, please let me know. Everything seems to be running smoothly so far!