June 16, 2021

# VIP Promotion

### The complete Time Series Analysis course has arrived

Hello friends!

2 years ago, I asked the students in my Tensorflow 2.0 course if they’d be interested in a course on time series. The answer was a resounding YES.

Don’t want to read the rest of this little spiel? Just get the coupon:

https://www.udemy.com/course/time-series-analysis/?couponCode=TIMEVIP

(Updated: Expires Oct 17, 2021) https://www.udemy.com/course/time-series-analysis/?couponCode=TIMEVIP4

(note: this VIP coupon expires in 30 days!)

Time series analysis is becoming an increasingly important analytical tool.

• With inflation on the rise, many are turning to the stock market and cryptocurrencies in order to ensure their savings do not lose their value.
• COVID-19 has shown us how forecasting is an essential tool for driving public health decisions.
• Businesses are becoming increasingly efficient, forecasting inventory and operational needs ahead of time.

Let me cut to the chase. This is not your average Time Series Analysis course. This course covers modern developments such as deep learning, time series classification (which can drive user insights from smartphone data, or read your thoughts from electrical activity in the brain), and more.

We will cover techniques such as:

• ETS and Exponential Smoothing
• Holt’s Linear Trend Model
• Holt-Winters Model
• ARIMA, SARIMA, SARIMAX, and Auto ARIMA
• ACF and PACF
• Vector Autoregression and Moving Average Models (VAR, VMA, VARMA)
• Machine Learning Models (including Logistic Regression, Support Vector Machines, and Random Forests)
• Deep Learning Models (Artificial Neural Networks, Convolutional Neural Networks, and Recurrent Neural Networks)
• GRUs and LSTMs for Time Series Forecasting

We will cover applications such as:

• Time series forecasting of sales data
• Time series forecasting of stock prices and stock returns
• Time series classification of smartphone data to predict user behavior

The VIP version of the course (obtained by purchasing the course NOW during the VIP period) will cover even more exciting topics, such as:

• AWS Forecast (Amazon’s state-of-the-art low-code forecasting API)
• GARCH (financial volatility modeling)
• FB Prophet (Facebook’s time series library)
• And MORE (it’s a secret!)

As always, please note that the VIP period may not last forever, and if / when the course becomes “non-VIP”, the VIP contents will be removed. If you purchased the VIP version, you will retain permanent access to the VIP content via my website, simply by letting me know via email you’d like access (you only need to email if I announce the VIP period is ending).

Small note:

I wanted to get this course into your hands early. Some sections are still in the editing stages, particularly:

• Convolutional Neural Networks (done, but more to be added later)
• Recurrent Neural Networks (done, but more to be added later)
• Vector Autoregression
• (VIP) GARCH
• (VIP) FB Prophet
• +MORE VIP CONTENT (it’s a surprise!)

UPDATE: The crossed-out items have since been added. There is no timeline for the remaining “surprise” lectures – it’ll be a surprise! 😉

So what are you waiting for? Get the VIP version of Time Series Analysis NOW:

# NEW COURSE: Financial Engineering and Artificial Intelligence in Python

September 8, 2020

# VIP Promotion

### The complete Financial Engineering course has arrived

Hello once again friends!

Today, I am announcing the VIP version of my latest course: Financial Engineering and Artificial Intelligence in Python.

https://www.udemy.com/course/ai-finance/?couponCode=FINANCEVIP (expires Oct 9, 2020)

https://www.udemy.com/course/ai-finance/?couponCode=FINANCEVIP13 (expires Oct 17, 2021)

(as usual, this coupon lasts only 30 days, so don’t wait!)

This is a MASSIVE (20 hours) Financial Engineering course covering the core fundamentals of financial engineering and financial analysis from scratch. We will go in-depth into all the classic topics, such as:

• Exploratory data analysis, significance testing, correlations
• Alpha and beta
• Advanced Pandas Data Frame manipulation for time series and finance
• Time series analysis, simple moving average, exponentially-weighted moving average
• Holt-Winters exponential smoothing model
• ARIMA and SARIMA
• Efficient Market Hypothesis
• Random Walk Hypothesis
• Time series forecasting (“stock price prediction”)
• Modern portfolio theory
• Efficient frontier / Markowitz bullet
• Mean-variance optimization
• Maximizing the Sharpe ratio
• Convex optimization with Linear Programming and Quadratic Programming
• Capital Asset Pricing Model (CAPM)

In addition, we will look at various non-traditional techniques which stem purely from the field of machine learning and artificial intelligence, such as:

• Regression models
• Classification models
• Unsupervised learning
• Reinforcement learning and Q-learning

We will learn about the greatest flub made in the past decade by marketers posing as “machine learning experts” who promise to teach unsuspecting students how to “predict stock prices with LSTMs”. You will learn exactly why their methodology is fundamentally flawed and why their results are complete nonsense. It is a lesson in how not to apply AI in finance.

### List of VIP-only Contents

As with my Tensorflow 2 release, some of the VIP content will be a surprise and will be released in stages. Currently, all of the Algorithmic Trading sections are VIP sections. Newly added VIP sections include Statistical Factor Models and “The Lazy Programmer Bonus Offer”. Here’s a full list:

Classic Algorithmic Trading – Trend Following Strategy

You will learn how moving averages can be applied to do algorithmic trading.

Forecast returns in order to determine when to buy and sell.

I give you a full introduction to Reinforcement Learning from scratch, and then we apply it to build a Q-Learning trader. Note that this is *not* the same as the example I used in my Tensorflow 2, PyTorch, and Reinforcement Learning courses. I think the example included in this course is much more principled and robust.

Statistical Factor Models

The CAPM is one of the most renowned financial models in history, but did you know it’s only the simplest factor model, with just a single factor? To go beyond just this single factor model, we will learn about statistical factor models, where the multiple “factors” are found automatically using only the data.

The Lazy Programmer Bonus Offer

There are marketers out there who want to capitalize on your enthusiastic interest in finance, and unfortunately what they are teaching you is utter and complete garbage.

They will claim that they can “predict stock prices with LSTMs” and show you charts like this with nearly perfect stock price predictions.

Hint: if they can do this, why do they bother putting effort into making courses? Wouldn’t they already be billionaires?

Have you ever wondered if you are taking such a course from a fake data scientist / marketer? If so, just send me a message, and I will tell you whether or not you are taking such a course. (Hint: many of you are) I will give you a list of mistakes they made so you can look out for them yourself, and avoid “learning” things which will ultimately make YOU look very bad in front of potential future employers.

Believe me, if you ever try to get a job in machine learning or data science and you talk about a project where you “predicted stock prices with LSTMs”, all you will be demonstrating is how incompetent you are. I don’t want to see any of my students falling for this! Save yourself from this embarrassing scenario by taking the “Lazy Programmer Offer”!

Please note: The VIP coupon will work only for the next month (starting from the coupon creation time). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access to these VIP contents on deeplearningcourses.com.

In case it’s not clear, the process is very easy. For those folks who want the “step-by-step” instructions:

STEP 1) I announce the VIP content will be removed.

STEP 2) You email me with proof that you purchased the course during the VIP period. Do NOT email me earlier as it will just get buried.

STEP 3) I will give you free access to the VIP materials for this course on deeplearningcourses.com.

### Benefits of taking this course

• Learn the knowledge you need to work at top tier investment firms
• Gain practical, real-world quantitative skills that can be applied within and outside of finance
• Make better decisions regarding your own finances

Personally, I think this is the most interesting and action-packed course I have created yet. My last few courses were cool, but they were all about topics which I had already covered in the past! GANs, NLP, Transfer Learning, Recommender Systems, etc. are all just machine learning topics I have covered several times in different libraries. This course contains new, fresh content and concepts I have never covered in any of my courses, ever.

This is the first course I’ve created that extends into a niche area of AI application. It goes outside of AI and into domain expertise. An in-depth topic such as finance deserves its own course. This is that course. These are topics you will never learn in a generic data science or machine learning course. However, as a student of AI, you will recognize many of our tools and methods being applied, such as statistical inference, supervised and unsupervised learning, convex optimization, and optimal control. This allows us to go deeper than your run of the mill financial engineering course, and it becomes more than just the sum of its parts.

So what are you waiting for?

April 1, 2020

# VIP Promotion

### The complete PyTorch course has arrived

Hello friends!

I hope you are all staying safe. Well, I’m sure you’ve heard enough about that so how about some different news?

Today, I am announcing the VIP version of my latest course: PyTorch: Deep Learning and Artificial Intelligence

https://www.udemy.com/course/pytorch-deep-learning/?couponCode=PYTORCHVIP18 (expires Oct 17, 2021)

This is a MASSIVE (over 22 hours) Deep Learning course covering EVERYTHING from scratch. That includes:

• Machine learning basics (linear neurons)
• ANNs, CNNs, and RNNs for images and sequence data
• Time series forecasting and stock predictions (+ why all those fake data scientists are doing it wrong)
• NLP (natural language processing)
• Recommender systems
• Transfer learning for computer vision
• Deep reinforcement learning and applying it by building a stock trading bot

IN ADDITION, you will get some unique and never-before-seen VIP projects:

Estimating prediction uncertainty

Drawing the standard deviation of the prediction along with the prediction itself. This is useful for heteroskedastic data (that means the variance changes as a function of the input). The most popular application where heteroskedasticity appears is stock prices and stock returns – which I know a lot of you are interested in.

It allows you to draw your model predictions like this:

Sometimes, the data is simply such that a spot-on prediction can’t be made. But we can do better by letting the model tell us how certain it is in its predictions.

Facial recognition with siamese networks

This one is cool. I mean, I don’t have to tell you how big facial recognition has become, right? It’s the single most controversial technology to come out of deep learning. In the past, we looked at simple ways of doing this with classification, but in this section I will teach you about an architecture built specifically for facial recognition.

You will learn how this can work even on small datasets – so you can build a network that recognizes your friends or can even identify all of your coworkers!

You can really impress your boss with this one. Surprise them one day with an app that calls out your coworkers by name every time they walk by your desk. 😉

Please note: The VIP coupon will work only for the next month (ending May 1, 2020). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access on deeplearningcourses.com.

## Minimal Prerequisites

This course is designed to be a beginner to advanced course. All that is required is that you take my free Numpy prerequisites to learn some basic scientific programming in Python. And it’s free, so why wouldn’t you!?

You will learn things that took me years to learn on my own. For many people, that is worth tens of thousands of dollars by itself.

There is no heavy math, no backpropagation, etc. Why? Because I already have courses on those things. So there’s no need to repeat them here, and PyTorch doesn’t use them. So you can relax and have fun. =)

## Why PyTorch?

All of my deep learning courses until now have been in Tensorflow (and prior to that Theano).

So why learn PyTorch?

Does this mean my future deep learning courses will use PyTorch?

In fact, if you have traveled in machine learning circles recently, you will have noticed that there has been a strong shift to PyTorch.

Case in point: OpenAI switched to PyTorch earlier this year (2020).

Major AI shops such as Apple, JPMorgan Chase, and Qualcomm have adopted PyTorch.

PyTorch is primarily maintained by Facebook (Facebook AI Research to be specific) – the “other” Internet giant who, alongside Google, has a strong vested interest in developing state-of-the-art AI.

But why PyTorch for you and me? (aside from the fact that you might want to work for one of the above companies)

As you know, Tensorflow has adopted the super simple Keras API. This makes common things easy, but it makes uncommon things hard.

With PyTorch, common things take a tiny bit of extra effort, but the upside is that uncommon things are still very easy.

Creating your own custom models and inventing your own ideas is seamless. We will see many examples of that in this course.

For this reason, it is very possible that future deep learning courses will use PyTorch, especially for those advanced topics that many of you have been asking for.

Because of the ease with which you can do advanced things, PyTorch is the main library used by deep learning researchers around the world. If that’s your goal, then PyTorch is for you.

In terms of growth rate, PyTorch dominates Tensorflow. PyTorch now outnumbers Tensorflow by 2:1 and even 3:1 at major machine learning conferences. Researchers hold that PyTorch is superior to Tensorflow in terms of the simplicity of its API, and even speed / performance!

Do you need more convincing?

March 18, 2020

# VIP Promotion

Hello all!

In this post, I am announcing the VIP coupon to my course titled “Artificial Intelligence: Reinforcement Learning in Python”.

There are 2 places to get the course.

1. Udemy, with this VIP coupon: https://www.udemy.com/course/artificial-intelligence-reinforcement-learning-in-python/?couponCode=REINFORCEVIP4 (expires Oct 17, 2021)
2. Deep Learning Courses (coupon automatically applied): https://deeplearningcourses.com/c/artificial-intelligence-reinforcement-learning-in-python

You may recognize this course as one that has already existed in my catalog – however, the course I am announcing today contains ALL-NEW material. The entire course has been gutted, and none of the lectures it now contains existed in the original version.

One of the most common questions I get from students in my PyTorch, Tensorflow 2, and Financial Engineering courses is: “How can I learn reinforcement learning?”

While I do cover RL in those courses, it’s very brief. I’ve essentially summarized 12 hours of material into 2. So by necessity, you will be missing some things.

While that serves as a good way to scratch the surface of RL, it doesn’t give you a true, in-depth understanding that you will get by actually learning each component of RL step-by-step, and most importantly, getting a chance to put everything into code!

This course covers:

• The explore-exploit dilemma and the Bayesian bandit method
• MDPs (Markov Decision Processes)
• Dynamic Programming solution for MDPs
• Monte Carlo Method
• Temporal Difference Method (including Q-Learning)
• Approximation Methods using Radial Basis Functions
• Applying your code to OpenAI Gym with zero effort / code changes
• Building a stock trading bot (different approach in each course!)

When you get the DeepLearningCourses.com version, note that you will get both versions (new and old) of the course – totalling nearly 20 hours of material.

If you want access to the tic-tac-toe project, this is the version you should get.

Otherwise, if you prefer to use Udemy, that’s fine too. If you purchase on Udemy but would like access to DeepLearningCourses.com, I will allow this since they are the same price. Just send me an email and show me your proof of purchase.

Note that I’m not able to offer the reverse (can’t give you access to Udemy if you purchase on DeepLearningCourses.com, due to operational reasons).

So what are you waiting for?

# Convert a Time Series Into an Image with Gramian Angular Fields and Markov Transition Fields

August 30, 2021

In my latest course (Time Series Analysis), I made subtle hints in the section on Convolutional Neural Networks that instead of using 1-D convolutions on 1-D time series, it is possible to convert a time series into an image and use 2-D convolutions instead.

CNNs with 2-D convolutions are the “typical” kind of neural network used in deep learning, which normally are used on images (e.g. ImageNet, object detection, segmentation, medical imaging and diagnosis, etc.)

In this article, we will look at 2 ways to convert a time series into an image:

1. Gramian Angular Field
2. Markov Transition Field

## Gramian Angular Field

The Gramian Angular Field is quite involved mathematically, so this article will discuss the intuition only, along with the code.

Those interested in all the gory details are encouraged to read the paper, titled “Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks” by Zhiguang Wang and Tim Oates.

We’ll build the intuition in a series of steps.

Let us begin by recalling that the dot product or inner product is a measure of similarity between two vectors.

$$\langle a, b\rangle = \lVert a \rVert \lVert b \rVert \cos \theta$$

Where $$\theta$$ is the angle between $$a$$ and $$b$$.

Ignoring the magnitude of the vectors, if the angle between them is small (i.e. close to 0) then the cosine of that angle will be nearly 1. If the vectors are perpendicular, the cosine of the angle is 0. If the two vectors are pointing in opposite directions, then the cosine of the angle will be -1.

The Gram Matrix is just the repeated application of the inner product between every vector in a set of vectors, and every other vector in that same set of vectors.

i.e. Suppose that we store a set of column vectors in a matrix called $$X$$.

The Gram Matrix is:

$$G = X^TX$$

This expands to:

$$G = \begin{bmatrix} \langle x_1, x_1 \rangle & \langle x_1, x_2 \rangle & \cdots & \langle x_1, x_N \rangle \\ \langle x_2, x_1 \rangle & \langle x_2, x_2 \rangle & \cdots & \langle x_2, x_N \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle x_N, x_1 \rangle & \langle x_N, x_2 \rangle & \cdots & \langle x_N, x_N \rangle \end{bmatrix}$$

In other words, if we think of the inner product as the similarity between two vectors, then the Gram Matrix just gives us the pairwise similarity between every vector and every other vector.
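To make the "pairwise similarity" idea concrete, here is a minimal sketch in plain Python (standard library only). The three vectors are made-up illustrative values, and the helper names `dot`, `gram_matrix`, and `cosine` are my own, not from any library.

```python
import math

def dot(a, b):
    """Standard inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def gram_matrix(vectors):
    """G[i][j] = <x_i, x_j> for every pair of vectors in the list."""
    return [[dot(a, b) for b in vectors] for a in vectors]

def cosine(a, b):
    """cos(theta) between a and b: the inner product with the magnitudes divided out."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Illustrative vectors: x2 is perpendicular to x1, x3 points opposite to x1.
vectors = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]]

G = gram_matrix(vectors)
for row in G:
    print(row)

print(cosine(vectors[0], vectors[1]))  # perpendicular pair
print(cosine(vectors[0], vectors[2]))  # opposite pair
```

Note that the Gram Matrix is symmetric, since the inner product does not care about the order of its arguments.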

Note that the Gramian Angular Field (GAF) does not apply the Gram Matrix directly (in fact, each value of the time series is a scalar, not a vector).

The first step in computing the GAF is to normalize the time series to be in the range [-1, +1].

Let’s assume we are given a time series $$X = \{x_1, x_2, \ldots, x_N \}$$.

The normalized values are denoted by $$\tilde{x_i}$$.

The second step is to convert each value in the normalized time series into polar coordinates.

We use the following transformation:

$$\phi_i = \arccos \tilde{x_i}$$

$$r_i = \frac{t_i}{N}$$

Where $$t_i \in \mathbb{N}$$ represents the timestamp of data point $$x_i$$.

Finally, the GAF method defines its own “special” inner product as:

$$\langle x_1, x_2 \rangle = \cos(\phi_1 + \phi_2)$$

From here, the above formula for $$G$$ still applies (except using $$\tilde{X}$$ instead of $$X$$, and using the custom inner product instead of the usual version).
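The steps above can be sketched from scratch in a few lines of plain Python. This is only an illustration under some assumptions: the rescaling to [-1, +1] uses min-max scaling (one common choice), the series is assumed not to be constant, and the function names are hypothetical. The radius $$r_i$$ is left out because it does not appear in the special inner product.

```python
import math

def rescale(series):
    """Min-max rescale a (non-constant) series into [-1, +1]."""
    lo, hi = min(series), max(series)
    return [((x - hi) + (x - lo)) / (hi - lo) for x in series]

def gramian_angular_field(series):
    """G[i][j] = cos(phi_i + phi_j), where phi_i = arccos of the rescaled x_i."""
    # Clamp to guard against rounding pushing a value just outside [-1, 1].
    scaled = [max(-1.0, min(1.0, x)) for x in rescale(series)]
    phi = [math.acos(x) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Made-up series for illustration.
series = [1.0, 2.0, 4.0, 3.0, 5.0]
G = gramian_angular_field(series)

for row in G:
    print(["%.2f" % v for v in row])
```

The result is an N x N matrix, so a length-N time series becomes an N x N single-channel "image".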

Here is an illustration of the process:

So why use the GAF?

Like the original Gram Matrix, it gives you a “picture” (no pun intended) of the relationship between every point and every other point in the time series.

That is, it displays the temporal correlation structure in the time series.

Here’s how you can use it in code.

Firstly, you need to install the pyts library. Then, run the following code on a time series of your choice:

Note that the library allows you to rescale the image with the image_size argument.

As an exercise, try using this method instead of the 1-D CNNs we used in the course and compare their performance!

## Markov Transition Field

The Markov Transition Field (MTF) is another method of converting a time series into an image.

The process is a bit simpler than that of the GAF.

If you have taken any of my courses which involve Markov Models (like Natural Language Processing, or HMMs) you should feel right at home.

Let’s assume we have an N-length time series.

We begin by putting each value in the time series into quantiles (i.e. we “bin” each value).

For example, if we use quartiles (4 bins), the smallest 25% of values would define the boundaries of the first quartile, the second smallest 25% of values would define the boundaries of the second quartile, etc.

We can think of each bin as a ‘state’ (using Markov model terminology).

Intuitively, we know that what we’d like to do when using Markov models is to form the state transition matrix.

This matrix has the values:

$$A_{ij} = P(s_t = j | s_{t-1} = i)$$

That is, $$A_{ij}$$ is the probability of transitioning from state i to state j.

As usual, we estimate this value by maximum likelihood. (The estimate of $$A_{ij}$$ is the count of transitions from i to j, divided by the total number of times we were in state i.)

Note that if we have $$Q$$ quantiles (i.e. we have $$Q$$ “states”), then $$A$$ is a $$Q \times Q$$ matrix.
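Here is a minimal sketch of the binning and counting steps in plain Python. It makes a few assumptions worth calling out: values are binned by rank (so tied values could land in different bins), the divisor is the number of transitions out of state i (the final time step has no outgoing transition), and a state that is never left gets a row of zeros. The function names and the series are made up for illustration.

```python
def quantile_bins(series, n_bins):
    """Assign each value a bin 0..n_bins-1 by rank, so bins hold roughly equal counts."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    bins = [0] * len(series)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(series)
    return bins

def transition_matrix(states, n_states):
    """Maximum-likelihood estimate: A[i][j] = count(i -> j) / count(transitions out of i)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for prev, curr in zip(states, states[1:]):
        counts[prev][curr] += 1
    A = []
    for row in counts:
        total = sum(row)
        A.append([c / total if total else 0.0 for c in row])
    return A

# Made-up series, binned into quartiles (Q = 4 states).
series = [1.0, 2.0, 8.0, 7.0, 3.0, 9.0, 2.5, 6.0]
states = quantile_bins(series, n_bins=4)
A = transition_matrix(states, n_states=4)

print(states)
for row in A:
    print(row)
```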

The MTF follows a similar concept.

The MTF (denoted by $$M$$) is an $$N \times N$$ matrix where:

$$M_{kl} = A_{q_k q_l}$$

And where $$q_k$$ is the quantile (“bin”) for $$x_k$$, and $$q_l$$ is the quantile for $$x_l$$.

Note: I haven’t re-used the letters i and j to index $$M$$. Most resources do, and it’s super confusing.

Do not mix up the indices for $$M$$ and $$A$$! The indices in $$A$$ refer to states. The indices for $$M$$ are temporal.

$$A_{ij}$$ is the probability of transitioning from state i to state j.

$$M_{kl}$$ is the probability of a one-step transition from the bin for $$x_k$$, to the bin for $$x_l$$.

That is, it looks at $$x_k$$ and $$x_l$$, which are 2 points in the time series at arbitrary time steps $$k$$ and $$l$$.

$$q_k$$ and $$q_l$$ are the corresponding quantiles.

$$M_{kl}$$ is then just the probability that we saw a direct one-step (i.e. Markovian) transition from $$q_k$$ to $$q_l$$ in the time series.
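Once the bins and the transition matrix are known, building $$M$$ is just a lookup. The sketch below assumes both are already available; the `states` list and the matrix `A` are small hand-made examples (chosen so that `A` matches the transition counts in `states`), and `markov_transition_field` is a hypothetical helper name.

```python
def markov_transition_field(states, A):
    """M[k][l] = A[q_k][q_l]: k and l index time steps, q_k and q_l index states."""
    return [[A[qk][ql] for ql in states] for qk in states]

# Hand-made example: a length-7 series already binned into 2 states.
states = [0, 0, 0, 0, 1, 1, 0]

# Transition matrix consistent with the sequence above:
# 0->0 three times, 0->1 once, 1->1 once, 1->0 once.
A = [[0.75, 0.25],
     [0.50, 0.50]]

M = markov_transition_field(states, A)

for row in M:
    print(row)
```

Notice how the indices differ: `A` is 2 x 2 (one row and column per state), while `M` is 7 x 7 (one row and column per time step).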

So why use the MTF?

It shows us how related 2 arbitrary points in the time series are, relative to how often they appear next to each other in the time series.

Here’s how you can use it in code.

Note that the library allows you to rescale the image with the image_size argument.

As an exercise, try using this method instead of the 1-D CNNs we used in the course and compare their performance!

Enjoy!

# Should you study the theory behind machine learning?

August 23, 2021

In this post, I want to discuss why you should not study the theory behind machine learning.

This may surprise some of you, since my courses can appear to be more “theoretical” than other ML courses on popular websites such as Udemy.

However, that is not the kind of “theory” I am talking about.

Most popular courses in ML don’t look at any math at all.

They are popular precisely for this reason: lack of math makes them accessible to the average Joe.

This does a disservice to you students, because you end up not having any solid understanding about how the algorithm works.

You may end up:

• doing things that don’t make sense, due to that lack of understanding.
• only being able to copy code from others, but not write any code yourself.
• not knowing how to apply algorithms to new kinds of data, without someone showing you how first.

For more discussion on that, see my post: “Why do you need math for machine learning and deep learning?”

But let’s make this clear: math != theory.

When we look at math in my courses, we only look at the math needed to derive the algorithm and understand how it works at an intuitive level.

Yes, believe it or not, we are using math to improve our intuition.

This is despite what many beginners might think. When they see math, they automatically assume “math” = “not intuitive”, and that “intuitive” = “pictures, animations, and purposely avoiding math”.

That’s OK if you want to read a news article in the NY Times about ML, but not when you want to be a practitioner of ML.

Those are 2 different levels of “intuition” (layman vs. practitioner).

To see an extreme example of this, one need not look any further than Albert Einstein. Einstein was great at communicating his ideas to the public. Everyone can easily understand the layman interpretation of general relativity (mass bends space and time). But this is not the same as being a practitioner of relativistic physics.

Everyone has seen this picture and understands what it means at a high level. But does that mean you are a physicist or that you can “do physics”?

Anyway, that was just an aside so we don’t confuse “math used for intuition” and “layman intuition” and “theory”. These are 3 separate things. Just because you’re looking at some math, does not automatically imply you’re looking at “theory”.

What do we mean by “theory”?

Here’s a simple question to consider. Why does gradient descent work?

Despite the fact that we have used gradient descent in many of my courses, and derived the gradient descent update rules for neural networks, SVMs, and other models, we have never discussed why it works.

And that’s OK!

The “mathematical intuition” is enough.

But let’s get back to the question of this article: Why is the Lazy Programmer saying we should not study theory?

Well, this is the kind of “theory” that gets so deep, it:

• Does not produce any near-term gains in your work
• Requires a very high level of math ability (e.g. real analysis, optimization, dynamical systems)
• Is on the cutting-edge of understanding, and thus very difficult, likely to be disputed or even superseded in the near future

Case in point: although we have been using gradient descent for years in my courses (and decades before that in general), our understanding is still not yet complete.

Here’s an article that just came out this year on gradient descent (August 2021): “Computer Scientists Discover Limits of Major Research Algorithm”.

Here’s a direct link to the corresponding paper, called “The Complexity of Gradient Descent: CLS = PPAD ∩ PLS”: https://arxiv.org/abs/2011.01929

There will be more papers on these “theory” topics in the years to come.

My advice is not to go down this path, unless you really enjoy it, you are doing graduate research (e.g. PhD-level), you don’t mind if ideas you spent years and years working on might be proven incorrect, and you have a very high level of math ability in subjects like real analysis, optimization, and dynamical systems.

# Predicting Stock Prices with Facebook Prophet

August 3, 2021

Prophet is Facebook’s library for time series forecasting. It is mainly geared towards business datasets (e.g. predicting ad spend or CPU usage), but a natural question that comes up with my students whenever we talk about time series is: “can it predict stock prices?”

In this article, I will discuss how to use FB Prophet to predict stock prices, and I’ll also show you what not to do (things I’ve seen in other popular blogs). Furthermore, we will benchmark the Prophet model with the naive forecast, to check whether or not one would really want to use this.

Note: This is an excerpt from my full VIP course, “Time Series Analysis, Forecasting, and Machine Learning”. If you want the code for this example, along with many, many other code examples on stock prices, sales data, and smartphone data, get the course!

The Prophet section will be part of the VIP version only, so get it now while the VIP coupon is still active!

## How does Prophet work?

The Prophet model is a 3-component, non-autoregressive time series model. Specifically:

$$y(t) = g(t) + s(t) + h(t) + \varepsilon(t)$$

Unlike ARIMA, exponential smoothing, and the other methods we study in a typical time series course (including my own), the Prophet model is not autoregressive.

The 3 components are:

1) The trend $$g(t)$$ which can be either linear or logistic.

2) The seasonality $$s(t)$$, modeled using a Fourier series.

3) The holiday component $$h(t)$$, which is essentially a one-hot vector “dotted” with a vector of weights, each representing the contribution from their respective holiday.
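To see how the three components add up, here is a toy sketch in plain Python. It is not Prophet's actual implementation: the trend is a single straight line, and every number (slope, offset, Fourier coefficients, holiday dates and weights) is made up for illustration. The noise term $$\varepsilon(t)$$ is omitted.

```python
import math

def trend(t, k=0.5, m=100.0):
    """Linear trend g(t) = k * t + m (illustrative slope and offset)."""
    return k * t + m

def seasonality(t, period=7.0, coeffs=((2.0, 1.0),)):
    """Fourier series s(t) = sum_n a_n cos(2 pi n t / P) + b_n sin(2 pi n t / P)."""
    return sum(
        a * math.cos(2 * math.pi * n * t / period)
        + b * math.sin(2 * math.pi * n * t / period)
        for n, (a, b) in enumerate(coeffs, start=1)
    )

def holiday(t, holidays, weights):
    """h(t): indicator vector ("is t one of this holiday's days?") dotted with the weights."""
    one_hot = [1.0 if t in days else 0.0 for days in holidays]
    return sum(z * w for z, w in zip(one_hot, weights))

# Two hypothetical holidays: one falls on day 3, the other spans days 10 and 11.
holidays = [{3}, {10, 11}]
weights = [5.0, -2.0]

def y_hat(t):
    """Noise-free model output: g(t) + s(t) + h(t)."""
    return trend(t) + seasonality(t) + holiday(t, holidays, weights)

for t in range(12):
    print(t, round(y_hat(t), 2))
```

Nothing in `y_hat` looks at past values of the series. It is a function of time only, which is what "non-autoregressive" means here.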

## How to use Prophet for predicting stock prices

In my course, we do 3 experiments. Our data is Google’s stock price from approximately 2013-2018, but we only use the first 2 years as training data.

The first experiment is “plug-and-play” into Prophet with the default settings.

Here are the results:

Unfortunately, Prophet mistakenly believes there is a weekly seasonal component, which is the reason for the little “hairs” in the forecast.

When we plot the components of the model, we see that Prophet has somehow managed to find some weekly seasonality.

Of course, this is completely wrong! The model believes that the stock price increases on the weekends, which it has no basis for because we don’t have any data for the weekend.

The second experiment is an example of what not to do. I saw this in every other popular blog, which is yet another “data point” that should convince you not to trust these popular data science blogs you find online (except for mine, obviously).

In this experiment, we set daily_seasonality to True in the model constructor.

Here are the results.

It seems like those weird little “hairs” coming from the weekly seasonal component have disappeared.

“The Lazy Programmer is wrong!” you may proclaim.

However, this is because you may not understand what daily seasonality really means.

Let’s see what happens when we plot the components.

This plot should make you very suspicious. Pay attention to the final chart.

“Daily seasonality” pertains to a pattern that repeats every day with sub-daily changes.

This cannot be the case, because our data only has daily granularity!
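One way to convince yourself of this: seasonality is modeled with a Fourier series, and Fourier terms with a 1-day period take the same value at every sample if you only observe the series once per day. The sketch below assumes time is measured in days and that the daily observations fall at the same time each day; the function name is hypothetical.

```python
import math

def daily_fourier_features(t_days, order=2):
    """Fourier features with a 1-day period, evaluated at time t (measured in days)."""
    feats = []
    for n in range(1, order + 1):
        feats.append(math.sin(2 * math.pi * n * t_days))
        feats.append(math.cos(2 * math.pi * n * t_days))
    return feats

# One observation per day (integer timestamps): the features never change.
daily_samples = [daily_fourier_features(t) for t in range(5)]

# Hourly observations: the same features now vary within the day.
hourly_samples = [daily_fourier_features(h / 24) for h in range(24)]

for feats in daily_samples:
    print([round(f, 6) for f in feats])
```

With daily data, the sine terms are always (numerically) zero and the cosine terms are always one, so the data carries no information about the shape of the curve within a day.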

Lesson: don’t listen to those “popular” blogs.

For experiment 3, we set weekly_seasonality to False. Alternatively, you could try playing around with the priors.

Here are the results.

Notice that the “little hairs” are again not present.

## Is this model actually good?

Just because you can make a nice chart, does not mean you have done anything useful.

In fact, you see the exact same mistakes in those blog articles and terrible Udemy courses promising to “predict stock prices with LSTMs” (which I will call out every chance I get).

One of the major mistakes I see in nearly every blog post about predicting stock prices is that they don’t bother to compare it to a benchmark. And as you’ll see, the benchmark for stock prices is quite a low bar – there is no reason not to compare.

Your model is only useful if it can beat the benchmark.

For stock price predictions, the benchmark is typically the naive forecast, which is the optimal forecast for a random walk.

Random walks are often used as a model for stock prices since they share some common attributes.

For those unfamiliar, the naive forecast is simply where you predict the last-known value.
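Here is a short sketch of why predicting the last-known value is optimal for a random walk. The assumptions are mine: the noise $$\varepsilon_t$$ has zero mean and is independent of the past, and "optimal" means minimizing expected squared error. A random walk $$y_t = y_{t-1} + \varepsilon_t$$ unrolled $$h$$ steps ahead gives:

$$y_{t+h} = y_t + \sum_{i=1}^{h} \varepsilon_{t+i}$$

Taking the expectation given everything known up to time $$t$$:

$$E[y_{t+h} \mid y_t, y_{t-1}, \ldots] = y_t + \sum_{i=1}^{h} E[\varepsilon_{t+i}] = y_t$$

The conditional mean is the forecast that minimizes expected squared error, so under these assumptions the best forecast at every horizon is simply $$y_t$$.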

Example: If today’s price on July 5 is $200 and I want to make a forecast with a 5-day horizon, then I will predict $200 for July 6, $200 for July 7, …, and $200 for July 10.
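In code, the same example might look like this. It is a minimal sketch: the function name `naive_forecast` and the earlier prices in the list are made up for illustration, and only the final $200 comes from the example.

```python
def naive_forecast(history, horizon):
    """Repeat the last-known value for every step of the horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon

# Made-up prices, ending at $200 on July 5.
prices = [195.0, 198.0, 197.0, 200.0]

forecast = naive_forecast(prices, horizon=5)
print(forecast)  # [200.0, 200.0, 200.0, 200.0, 200.0]
```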

I won’t bore you with the code (although it’s included in the course if you’re interested), but the answer is: Prophet does not beat the naive forecast.

In fact, it does not beat the naive forecast on any horizon I tried (5 days, 30 days, 60 days).

Sidenote: it’d be a good exercise to try 1 day as well.
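If you want to set up this kind of comparison yourself, here is one possible sketch. It is not the code from the course. The data is a simulated random walk, `last_change_forecast` is a placeholder standing in for whatever model you want to test, and `walk_forward_mae` is a helper name I made up. Whatever it prints says nothing about Prophet; the point is only the shape of the comparison.

```python
import random

def naive_forecast(history, horizon):
    """Benchmark: repeat the last-known value."""
    return [history[-1]] * horizon

def last_change_forecast(history, horizon):
    """Placeholder 'model' (illustration only): extrapolate the latest one-step change."""
    step = history[-1] - history[-2]
    return [history[-1] + step * (i + 1) for i in range(horizon)]

def walk_forward_mae(series, forecaster, horizon, min_history=2):
    """Mean absolute error over every forecast origin that leaves `horizon` future points."""
    errors = []
    for origin in range(min_history, len(series) - horizon + 1):
        history = series[:origin]
        future = series[origin:origin + horizon]
        preds = forecaster(history, horizon)
        errors.extend(abs(p - a) for p, a in zip(preds, future))
    return sum(errors) / len(errors)

# Simulated random walk (assumption: Gaussian steps, fixed seed for repeatability).
rng = random.Random(0)
series = [100.0]
for _ in range(299):
    series.append(series[-1] + rng.gauss(0.0, 1.0))

for horizon in (1, 5, 30, 60):
    mae_naive = walk_forward_mae(series, naive_forecast, horizon)
    mae_model = walk_forward_mae(series, last_change_forecast, horizon)
    verdict = "beats" if mae_model < mae_naive else "does not beat"
    print(f"h={horizon}: naive={mae_naive:.3f} model={mae_model:.3f} -> model {verdict} naive")
```

To test a real model, swap it in for `last_change_forecast` and keep the same origins and horizons for both forecasters, so the comparison is apples to apples.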

Are stock prices really random walks? Although this particular example provides evidence supporting the random walk hypothesis, in my course, the GARCH section will provide strong evidence against it! Again, it’s all explained in my latest course, “Time Series Analysis, Forecasting, and Machine Learning”. Only the VIP version will contain the sections on Prophet, GARCH, and other important tools.

The VIP version is intended to be limited-time only, and the current coupon expires in less than one month!

Get your copy today while you still can.

# Why do you need math for machine learning and deep learning?

July 9, 2021

In this article, I will demonstrate why math is necessary for machine learning, data science, deep learning, and AI.

Most of my students have already heard this from me countless times. College-level math is a prerequisite for nearly all of my courses already.

Perhaps you may believe I am biased, because I’m the one teaching these courses which require all this math.

It would seem that I am just some crazy guy, making things extra hard for you because I like making things difficult.

WRONG.

You’ve heard it from me many times. Now you’ll hear it from others.

## Example #1

Let’s begin with one of the most famous professors in ML, Daphne Koller, who co-founded Coursera.

In this clip, Lex Fridman asks what advice she would have for those interested in beginning a journey into AI and machine learning.

One important thing she mentions, which I have seen time and time again in my own experience, is that those without typical prerequisite math backgrounds often make mistakes and do things that don’t make sense.

She’s being nice here, but I’ve met many of these folks who not only have no idea that what they are doing does not make sense, they also tend to be overly confident about it!

Then it becomes a burden for me, because I have to put in more effort explaining the basics to you just to convince you that you are wrong.

For that reason, I generally advise against hiring people for ML roles if they do not know basic math.

## Example #2

I enjoyed this strongly worded Reddit comment.

Original post:

Top comment:

## Example #3

This one is not exactly machine learning, but a closely related field: quant finance.

In fact, many students taking my courses dream about applying ML to finance.

Well, it’s going to be pretty hard if you can’t pass these interview questions.

http://www.math.kent.edu/~oana/math60070/InterviewProblems.pdf

Think about this logically: All quants who have a job can pass these kinds of interview questions. But you cannot. How well do you think you will do compared to them?

## Example #4

On the Joe Rogan Experience, entrepreneur and angel investor Naval Ravikant explains why deriving (what we do in all of my in-depth machine learning courses) is much more important than memorizing.

Most beginner-level Udemy courses don’t derive anything – they just tell you random facts about ML algorithms and then jump straight to the usual 3 lines of scikit-learn code. Useless!

# Time Series: How to convert AR(p) to VAR(1) and VAR(p) to VAR(1)

July 1, 2021

This is a very condensed post, mainly just so I could write down the equations I need for my Time Series Analysis course. 😉

However, if you find it useful – I am happy to hear that!

[Get 75% off the VIP version here]

Suppose we have an AR(2):

$$y_t = b + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$$

Suppose we create a vector containing both $$y_t$$ and $$y_{t -1}$$:

$$\begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix}$$

We can write our AR(2) as follows:

$$\begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix} + \begin{bmatrix} \phi_1 & \phi_2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \end{bmatrix}$$

Exercise: expand the above to see that you get back the original AR(2). Note that the 2nd line just ends up giving you $$y_{t-1} = y_{t-1}$$.
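If you'd rather check the exercise numerically, here is a minimal sketch using NumPy. The parameter values, the seed, and the zero starting values are arbitrary choices of mine. It runs the scalar AR(2) recursion and the matrix form side by side with the same noise and confirms they produce the same series.

```python
import numpy as np

# Arbitrary illustrative parameters (not from the post).
b, phi1, phi2 = 0.5, 0.6, -0.2
T = 50
rng = np.random.default_rng(0)
eps = rng.normal(size=T)

# Scalar AR(2) recursion, starting from y_0 = y_1 = 0.
y = np.zeros(T)
for t in range(2, T):
    y[t] = b + phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]

# The same process in matrix form, with state [y_t, y_{t-1}].
b_vec = np.array([b, 0.0])
Phi = np.array([[phi1, phi2],
                [1.0, 0.0]])
z = np.array([y[1], y[0]])          # initial state [y_1, y_0]
y_from_matrix = [y[0], y[1]]
for t in range(2, T):
    eta = np.array([eps[t], 0.0])   # noise only enters the first row
    z = b_vec + Phi @ z + eta
    y_from_matrix.append(z[0])

print(np.allclose(y, y_from_matrix))  # True
```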

The above is just a VAR(1)!

You can see this by letting:

$$\textbf{z}_t = \begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix}$$

$$\textbf{b}' = \begin{bmatrix} b \\ 0 \end{bmatrix}$$

$$\boldsymbol{\Phi}'_1 = \begin{bmatrix} \phi_1 & \phi_2 \\ 1 & 0 \end{bmatrix}$$

$$\boldsymbol{\eta}_t = \begin{bmatrix} \varepsilon_t \\ 0 \end{bmatrix}$$.

Then we get:

$$\textbf{z}_t = \textbf{b}' + \boldsymbol{\Phi}'_1\textbf{z}_{t-1} + \boldsymbol{\eta}_t$$

Which is a VAR(1).

Now let us try to do the same thing with an AR(3).

$$y_t = b + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \varepsilon_t$$

We can write our AR(3) as follows:

$$\begin{bmatrix} y_t \\ y_{t-1} \\ y_{t-2} \end{bmatrix} = \begin{bmatrix} b \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \\ y_{t-3} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \\ 0 \end{bmatrix}$$

Note that this is also a VAR(1).

Of course, we can just repeat the same pattern for AR(p).
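As a sketch of that pattern for general p, the coefficient matrix can be built like this: the AR coefficients go in the first row, and an identity matrix shifted one row down fills the rest. The function name `companion_matrix` and the example coefficients are my own choices for illustration.

```python
import numpy as np

def companion_matrix(phis):
    """Build the p x p VAR(1) coefficient matrix from AR coefficients phi_1..phi_p."""
    p = len(phis)
    Phi = np.zeros((p, p))
    Phi[0, :] = phis                # first row holds the AR coefficients
    Phi[1:, :-1] = np.eye(p - 1)    # shifted identity copies each lag down one slot
    return Phi

# Arbitrary AR(3) coefficients for illustration.
print(companion_matrix([0.5, 0.3, 0.1]))
```

For p = 3 this reproduces the layout of the AR(3) matrix above, with the example numbers in place of the $$\phi$$'s.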

The cool thing is, we can extend this to VAR(p) as well, to show that any VAR(p) can be expressed as a VAR(1).

Suppose we have a VAR(3).

$$\textbf{y}_t = \textbf{b} + \boldsymbol{\Phi}_1 \textbf{y}_{t-1} + \boldsymbol{\Phi}_2 \textbf{y}_{t-2} + \boldsymbol{\Phi}_3 \textbf{y}_{t-3} + \boldsymbol{ \varepsilon }_t$$

Now suppose that we create a new vector by concatenating $$\textbf{y}_t$$, $$\textbf{y}_{t-1}$$, and $$\textbf{y}_{t-2}$$. We get:

$$\begin{bmatrix} \textbf{y}_t \\ \textbf{y}_{t-1} \\ \textbf{y}_{t-2} \end{bmatrix} = \begin{bmatrix} \textbf{b} \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} \boldsymbol{\Phi}_1 & \boldsymbol{\Phi}_2 & \boldsymbol{\Phi}_3 \\ I & 0 & 0 \\ 0 & I & 0 \end{bmatrix} \begin{bmatrix} \textbf{y}_{t-1} \\ \textbf{y}_{t-2} \\ \textbf{y}_{t-3} \end{bmatrix} + \begin{bmatrix} \boldsymbol{\varepsilon}_t \\ 0 \\ 0 \end{bmatrix}$$

This is a VAR(1)!
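Here is a minimal numerical check of the block version, again using NumPy. Everything specific is an assumption for illustration: 2 variables, randomly drawn coefficient matrices, zero starting values, and the helper name `var_companion`. It simulates a VAR(3) directly, then again through the stacked VAR(1), and compares the two.

```python
import numpy as np

def var_companion(Phis):
    """Stack VAR(p) matrices Phi_1..Phi_p (each k x k) into one (k*p) x (k*p) matrix."""
    p = len(Phis)
    k = Phis[0].shape[0]
    top = np.hstack(Phis)                               # [Phi_1 Phi_2 ... Phi_p]
    bottom = np.hstack([np.eye(k * (p - 1)),            # identity blocks shift the lags down
                        np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

# Arbitrary illustrative VAR(3) with k = 2 variables.
rng = np.random.default_rng(1)
k, p, T = 2, 3, 40
Phis = [0.1 * rng.normal(size=(k, k)) for _ in range(p)]
b = rng.normal(size=k)
eps = rng.normal(size=(T, k))

# Direct VAR(3) recursion, starting from zeros.
y = np.zeros((T, k))
for t in range(p, T):
    y[t] = b + sum(Phis[i] @ y[t - 1 - i] for i in range(p)) + eps[t]

# The same process through the stacked VAR(1).
A = var_companion(Phis)
b_big = np.concatenate([b, np.zeros(k * (p - 1))])
z = np.concatenate([y[p - 1 - i] for i in range(p)])    # state [y_2, y_1, y_0]
y_stacked = [y[t] for t in range(p)]
for t in range(p, T):
    eta = np.concatenate([eps[t], np.zeros(k * (p - 1))])
    z = b_big + A @ z + eta
    y_stacked.append(z[:k])                             # first block is y_t

print(np.allclose(y, np.array(y_stacked)))  # True
```

The stacked state has dimension k × p, so you trade a higher-order model for a bigger one. Nothing about the process itself changes.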