# Artificial Intelligence Boxing Day Blowout!

December 26, 2018

#### Deep Learning and AI Courses for just $11.99

# Boxing Day 2018

### Celebrate the Holidays with New AI & Deep Learning Courses!

I’ve been busy making free content and updates for my existing courses, so guess what that means? Everything on sale!

For the next week, all my Deep Learning and AI courses are available for just $11.99!

For my courses, please use the coupons below (included in the links), or if you want, enter the coupon code: DEC2018.

For prerequisite courses (math, stats, Python programming) and all other courses, follow the links at the bottom for sales of up to 90% off!

Since ALL courses on Udemy are on sale, if you want any course not listed here, just click the general (site-wide) link and search for courses from that page.

# Neural Ordinary Differential Equations

December 15, 2018

Very interesting paper that got the Best Paper award at NIPS 2018.

“Neural Ordinary Differential Equations” by Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud.

Comes out of Geoffrey Hinton’s Vector Institute in Toronto, Canada (although he is not an author on the paper).

For those of you who have ever programmed simulations of systems of differential equations, the motivation behind this should be quite intuitive.

Recall that a derivative is the same thing as the slope of a tangent line, and can be approximated by the usual “rise over run” formula for small time steps $$\Delta t$$.

$$\frac{dh}{dt} \approx \frac{h(t + \Delta t) - h(t)}{\Delta t}$$

Here’s a picture of that if you forgot what it looks like:

Normally, the derivative is known to be some function $$\frac{dh}{dt} = f(h, t)$$.

Your job in writing a simulation is to find out how $$h(t)$$ evolves over time.

Here’s a picture of how that works (using different symbols):

Since our job is to find the next value of $$h(t)$$, we can rearrange the above to get:

$$h(t + \Delta t) = h(t) + f(h(t), t) \Delta t$$
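If you've never written one of these simulations, here is a minimal sketch of that update rule in plain Python. The function name `euler_simulate` and the test system $$\frac{dh}{dt} = -h$$ are my own choices for illustration (the test system is handy because its exact solution, $$e^{-t}$$, is known).

```python
import math

def euler_simulate(f, h0, t0, t1, dt):
    """Step dh/dt = f(h, t) from t0 to t1 using the update rule above."""
    n_steps = round((t1 - t0) / dt)
    h, t = h0, t0
    trajectory = [(t, h)]
    for _ in range(n_steps):
        h = h + f(h, t) * dt  # h(t + dt) = h(t) + f(h(t), t) * dt
        t = t + dt
        trajectory.append((t, h))
    return trajectory

# Illustrative system: dh/dt = -h, whose exact solution is h(t) = exp(-t).
def decay(h, t):
    return -h

trajectory = euler_simulate(decay, h0=1.0, t0=0.0, t1=1.0, dt=0.001)
t_final, h_final = trajectory[-1]
print("simulated h(1):", h_final)
print("exact h(1):    ", math.exp(-1.0))
```

With a small time step the simulated value lands close to the exact one; make `dt` larger and the two drift apart.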

If we set the time step to $$\Delta t = 1$$ and write $$h_t$$ for $$h(t)$$, the update rule becomes:

$$h_{t+1} = h_t + f(h_t, t)$$

Researchers noticed that this looks a lot like the residual network layer that is often used in deep learning!

In a residual network layer, $$h_t$$ represents the input value, $$h_{t+1}$$ represents the output value, and $$f(h_t, t)$$ represents the residual.
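To make the correspondence concrete, here is a small sketch in plain Python. The residual function below (one dense layer followed by tanh) and the names `residual_branch`, `residual_layer`, and `euler_step` are assumptions made up for this example, not anything taken from the paper. In a real residual network each layer has its own weights, which is roughly the role the $$t$$ argument plays in $$f(h_t, t)$$.

```python
import math

def residual_branch(h, weights, bias):
    """Hypothetical residual function f: a dense layer followed by tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, h)) + b)
        for row, b in zip(weights, bias)
    ]

def residual_layer(h, weights, bias):
    """h_{t+1} = h_t + f(h_t): the skip connection adds the input back in."""
    residual = residual_branch(h, weights, bias)
    return [x + r for x, r in zip(h, residual)]

def euler_step(h, weights, bias, dt):
    """One step of the simulation update rule, using the same f."""
    residual = residual_branch(h, weights, bias)
    return [x + dt * r for x, r in zip(h, residual)]

# Arbitrary illustrative numbers.
h_in = [0.5, -1.0]
weights = [[0.1, 0.2], [-0.3, 0.4]]
bias = [0.0, 0.1]

print(residual_layer(h_in, weights, bias))
print(euler_step(h_in, weights, bias, dt=1.0))  # same numbers as the line above
```

The two printed lists match because a residual layer is exactly one such step with a time step of 1.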

Here’s a picture of that (using different symbols):

At this point, the question to ask is, if a residual network layer is just a difference equation that approximates a differential equation, can there be a neural network layer that is an actual differential equation?
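One rough way to picture the forward pass of such a layer: treat the input as $$h(0)$$, let $$\frac{dh}{dt} = f(h, t)$$ run for a fixed amount of "time", and call $$h(1)$$ the output. The sketch below is only a stand-in that takes many small fixed steps of the same update rule; the paper instead hands this job to a black-box ODE solver. The name `ode_layer`, the integration interval from 0 to 1, and the toy `dynamics` function (which a real model would replace with a small neural network) are all assumptions for illustration.

```python
import math

def ode_layer(h0, f, t0=0.0, t1=1.0, n_steps=100):
    """Approximate h(t1) given h(t0) = h0 by taking n_steps small steps."""
    dt = (t1 - t0) / n_steps
    h, t = list(h0), t0
    for _ in range(n_steps):
        dh = f(h, t)
        h = [x + dt * d for x, d in zip(h, dh)]
        t += dt
    return h

# Toy dynamics standing in for a learned network: dh/dt = -h.
def dynamics(h, t):
    return [-x for x in h]

one_step = ode_layer([1.0], dynamics, n_steps=1)        # a single residual-style step
many_steps = ode_layer([1.0], dynamics, n_steps=10000)  # close to the true ODE solution
print(one_step, many_steps, math.exp(-1.0))
```

With `n_steps=1` this is just the residual layer from before; as `n_steps` grows, the output approaches the solution of the differential equation itself.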

How would backpropagation be done?

This paper goes over all that and more.