May 25, 2022

# VIP Promotion

### The complete Transformers course has arrived

Hello friends!

Welcome to my latest course, Transformers for Natural Language Processing (NLP).

https://www.udemy.com/course/data-science-transformers-nlp/?couponCode=TRANSFORMERSVIP (expires in 30 days – June 25, 2022!)

https://www.udemy.com/course/data-science-transformers-nlp/?couponCode=TRANSFORMERSVIP2

(expires July 26, 2022)

Transformers have changed deep learning immensely.

They’ve massively improved the state-of-the-art in all NLP tasks, like sentiment analysis, machine translation, question-answering, etc.

They’re even expanding their influence into other fields, such as computational biology and computer vision. DeepMind’s AlphaFold 2 has been said to “solve” a longstanding problem in molecular biology, known as protein structure prediction. Recently, DALL-E 2 demonstrated the ability to generate amazing art and photo-realistic images based only on simple text prompts. Imagine that – creating a realistic image out of just an idea!

Just within the past week, DeepMind introduced “Gato”, which is what they call a “generalist agent”, an AI that can do multiple things, like chat (i.e. do NLP!), play Atari games, caption images (i.e. computer vision!), manipulate a real, physical robot arm to stack blocks, and more!

Gato does all this by converting all the usual inputs from other domains into a sequence of tokens, so that they can be processed just like how we do in NLP. This is a great example of my oft-repeated rule, “all data is the same” (and also, another great reason to learn NLP since it would be a prerequisite to understanding this).

The course is split into 3 major parts:

1. Using Transformers (Beginner)
2. Fine-Tuning Transformers (Intermediate)
3. Transformers In-Depth (Expert – VIP only)

In part 1, you will learn how to use transformers which were trained for you. Training them costs millions of dollars, so it’s not something you want to try by yourself!

We’ll see how these prebuilt models can already be used for a wide array of tasks, including:

• text classification (e.g. spam detection, sentiment analysis, document categorization)
• named entity recognition
• text summarization
• machine translation
• generating (believable) text
• masked language modeling (article spinning)
• zero-shot classification

If you need to do sentiment analysis, document categorization, entity recognition, translation, summarization, etc. on documents at your workplace or for your clients – you already have the most powerful state-of-the-art models at your fingertips with very few lines of code.

One of the most amazing applications is “zero-shot classification”, where you will observe that a pretrained model can categorize your documents, even without any training at all.

In part 2, you will learn how to improve the performance of transformers on your own custom datasets. By using “transfer learning”, you can leverage the millions of dollars of training that have already gone into making transformers work very well.

You’ll see that you can fine-tune a transformer for many of the above tasks with relatively little work (and little cost).

In part 3 (the VIP sections), you will learn how transformers really work. The previous sections are nice, but a little too nice. Libraries are OK for people who just want to get the job done, but they don’t work if you want to do anything new or interesting.

Let’s be clear: this is very practical.

Well, this is where the big bucks are.

Those who have a deep understanding of these models and can do things no one has ever done before are in a position to command higher salaries and prestigious titles. Machine learning is a competitive field, and a deep understanding of how things work can be the edge you need to come out on top.

We’ll also look at how to implement transformers from scratch.

As the great Richard Feynman once said, “what I cannot create, I do not understand”.

NOTES:

• As usual, I wanted to get this course into your hands as early as possible! There are a few sections and lectures still in the works, including (but not limited to): fine-tuning for question-answering, more theory about transformers, and implementing transformers from scratch. As usual, I will update this post as new lectures are released.
• Everyone makes mistakes (including me)! Because this is such a large course, if I forgot anything (e.g. a GitHub link), just email me and let me know.
• Due to the way Udemy now works, if you purchase the course on deeplearningcourses.com, I cannot give you access to the Udemy version. It hasn’t always been this way, and Udemy has tended to make changes over the years that negatively impact both me and you, unfortunately.
• If you don’t know how “VIP courses” work, check out my post on that here. Short version: deeplearningcourses.com always houses all the content (both VIP and non-VIP). Udemy will house all the content initially, but the VIP content is removed later on.

So what are you waiting for? Get the VIP version of Transformers for Natural Language Processing NOW:

June 16, 2021

# VIP Promotion

### The complete Time Series Analysis course has arrived

Hello friends!

2 years ago, I asked the students in my Tensorflow 2.0 course if they’d be interested in a course on time series. The answer was a resounding YES.

Don’t want to read the rest of this little spiel? Just get the coupon:

https://www.udemy.com/course/time-series-analysis/?couponCode=TIMEVIP (note: this VIP coupon expires in 30 days!)

(Updated: Expires July 26, 2022) https://www.udemy.com/course/time-series-analysis/?couponCode=TIMEVIP13

Time series analysis is becoming an increasingly important analytical tool.

• With inflation on the rise, many are turning to the stock market and cryptocurrencies in order to ensure their savings do not lose their value.
• COVID-19 has shown us how forecasting is an essential tool for driving public health decisions.
• Businesses are becoming increasingly efficient, forecasting inventory and operational needs ahead of time.

Let me cut to the chase. This is not your average Time Series Analysis course. This course covers modern developments such as deep learning, time series classification (which can drive user insights from smartphone data, or read your thoughts from electrical activity in the brain), and more.

We will cover techniques such as:

• ETS and Exponential Smoothing
• Holt’s Linear Trend Model
• Holt-Winters Model
• ARIMA, SARIMA, SARIMAX, and Auto ARIMA
• ACF and PACF
• Vector Autoregression and Moving Average Models (VAR, VMA, VARMA)
• Machine Learning Models (including Logistic Regression, Support Vector Machines, and Random Forests)
• Deep Learning Models (Artificial Neural Networks, Convolutional Neural Networks, and Recurrent Neural Networks)
• GRUs and LSTMs for Time Series Forecasting

We will cover applications such as:

• Time series forecasting of sales data
• Time series forecasting of stock prices and stock returns
• Time series classification of smartphone data to predict user behavior

The VIP version of the course (obtained by purchasing the course NOW during the VIP period) will cover even more exciting topics, such as:

• AWS Forecast (Amazon’s state-of-the-art low-code forecasting API)
• GARCH (financial volatility modeling)
• FB Prophet (Facebook’s time series library)
• Granger Causality

As always, please note that the VIP period may not last forever, and if / when the course becomes “non-VIP”, the VIP contents will be removed. If you purchased the VIP version, you will retain permanent access to the VIP content via my website, simply by letting me know via email you’d like access (you only need to email if I announce the VIP period is ending).

So what are you waiting for? Get the VIP version of Time Series Analysis NOW:

# NEW COURSE: Financial Engineering and Artificial Intelligence in Python

September 8, 2020

# VIP Promotion

### The complete Financial Engineering course has arrived

Hello once again friends!

Today, I am announcing the VIP version of my latest course: Financial Engineering and Artificial Intelligence in Python.

https://www.udemy.com/course/ai-finance/?couponCode=FINANCEVIP (expires Oct 9, 2020)

https://www.udemy.com/course/ai-finance/?couponCode=FINANCEVIP22 (expires July 26, 2022)

(as usual, this coupon lasts only 30 days, so don’t wait!)

This is a MASSIVE (21 hours) Financial Engineering course covering the core fundamentals of financial engineering and financial analysis from scratch. We will go in-depth into all the classic topics, such as:

• Exploratory data analysis, significance testing, correlations
• Alpha and beta
• Advanced Pandas Data Frame manipulation for time series and finance
• Time series analysis, simple moving average, exponentially-weighted moving average
• Holt-Winters exponential smoothing model
• ARIMA and SARIMA
• Efficient Market Hypothesis
• Random Walk Hypothesis
• Time series forecasting (“stock price prediction”)
• Modern portfolio theory
• Efficient frontier / Markowitz bullet
• Mean-variance optimization
• Maximizing the Sharpe ratio
• Convex optimization with Linear Programming and Quadratic Programming
• Capital Asset Pricing Model (CAPM)

In addition, we will look at various non-traditional techniques which stem purely from the field of machine learning and artificial intelligence, such as:

• Regression models
• Classification models
• Unsupervised learning
• Reinforcement learning and Q-learning

We will learn about the greatest flub made in the past decade by marketers posing as “machine learning experts” who promise to teach unsuspecting students how to “predict stock prices with LSTMs”. You will learn exactly why their methodology is fundamentally flawed and why their results are complete nonsense. It is a lesson in how not to apply AI in finance.

### List of VIP-only Contents

As with my Tensorflow 2 release, some of the VIP content will be a surprise and will be released in stages. Currently, all of the Algorithmic Trading sections are VIP sections. Newly added VIP sections include Statistical Factor Models and “The Lazy Programmer Bonus Offer”. Here’s a full list:

Classic Algorithmic Trading – Trend Following Strategy

You will learn how moving averages can be applied to do algorithmic trading.
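To make the idea concrete, here is a minimal sketch of one common trend-following rule: hold the asset while a fast moving average sits above a slow one. The crossover rule, the window lengths, and the toy prices are all assumptions for illustration; this is not necessarily the exact strategy used in the course.

```python
def simple_moving_average(prices, window):
    """Rolling mean; entries before a full window is available are None."""
    out = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices, fast=2, slow=4):
    """Return 1 (hold the asset) when the fast SMA is above the slow SMA, else 0."""
    fast_ma = simple_moving_average(prices, fast)
    slow_ma = simple_moving_average(prices, slow)
    signals = []
    for f, s in zip(fast_ma, slow_ma):
        if f is None or s is None:
            signals.append(0)  # not enough history yet, stay out
        else:
            signals.append(1 if f > s else 0)
    return signals


# Hypothetical prices: an uptrend followed by a downtrend
prices = [10, 11, 12, 13, 12, 11, 10, 9]
signals = crossover_signals(prices, fast=2, slow=4)
print(signals)
```

In a real backtest, the signal computed at time t should only be applied to the return from t to t+1, otherwise you are peeking into the future.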

Forecast returns in order to determine when to buy and sell.

I give you a full introduction to Reinforcement Learning from scratch, and then we apply it to build a Q-Learning trader. Note that this is *not* the same as the example I used in my Tensorflow 2, PyTorch, and Reinforcement Learning courses. I think the example included in this course is much more principled and robust.

Statistical Factor Models

The CAPM is one of the most renowned financial models in history, but did you know it’s only the simplest factor model, with just a single factor? To go beyond just this single factor model, we will learn about statistical factor models, where the multiple “factors” are found automatically using only the data.

Regime Detection with Hidden Markov Models (HMMs)

In the first section on financial basics, we learn how to model the distribution of returns. But can we really say “the” distribution, as if there is only one?

One important “stylized fact” about returns is that volatility “clusters” or “persists”. That is, large returns tend to be surrounded by more large returns, and small returns by more small returns.

In other words, returns are actually nonstationary and to build a more accurate model we should not assume that they all come from the same distribution at all times.

Using HMMs, we can model this behavior. HMMs allow you to model hidden state sequences (high volatility and low volatility regimes), from which observations (the actual returns) are generated.

The Lazy Programmer Bonus Offer

There are marketers out there who want to capitalize on your enthusiastic interest in finance, and unfortunately what they are teaching you is utter and complete garbage.

They will claim that they can “predict stock prices with LSTMs” and show you charts like this with nearly perfect stock price predictions.

Hint: if they can do this, why do they bother putting effort into making courses? Wouldn’t they already be billionaires?

Have you ever wondered if you are taking such a course from a fake data scientist / marketer? If so, just send me a message, and I will tell you whether or not you are taking such a course. (Hint: many of you are) I will give you a list of mistakes they made so you can look out for them yourself, and avoid “learning” things which will ultimately make YOU look very bad in front of potential future employers.

Believe me, if you ever try to get a job in machine learning or data science and you talk about a project where you “predicted stock prices with LSTMs”, all you will be demonstrating is how incompetent you are. I don’t want to see any of my students falling for this! Save yourself from this embarrassing scenario by taking the “Lazy Programmer Offer”!

Please note: The VIP coupon will work only for the next month (starting from the coupon creation time). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access to these VIP contents on deeplearningcourses.com.

In case it’s not clear, the process is very easy. For those folks who want the “step-by-step” instructions:

STEP 1) I announce the VIP content will be removed.

STEP 2) You email me with proof that you purchased the course during the VIP period. Do NOT email me earlier as it will just get buried.

STEP 3) I will give you free access to the VIP materials for this course on deeplearningcourses.com.

### Benefits of taking this course

• Learn the knowledge you need to work at top tier investment firms
• Gain practical, real-world quantitative skills that can be applied within and outside of finance
• Make better decisions regarding your own finances

Personally, I think this is the most interesting and action-packed course I have created yet. My last few courses were cool, but they were all about topics which I had already covered in the past! GANs, NLP, Transfer Learning, Recommender Systems, etc. are all just machine learning topics I have covered several times in different libraries. This course contains new, fresh content and concepts I have never covered in any of my courses, ever.

This is the first course I’ve created that extends into a niche area of AI application. It goes outside of AI and into domain expertise. An in-depth topic such as finance deserves its own course. This is that course. These are topics you will never learn in a generic data science or machine learning course. However, as a student of AI, you will recognize many of our tools and methods being applied, such as statistical inference, supervised and unsupervised learning, convex optimization, and optimal control. This allows us to go deeper than your run of the mill financial engineering course, and it becomes more than just the sum of its parts.

So what are you waiting for?

April 1, 2020

# VIP Promotion

### The complete PyTorch course has arrived

Hello friends!

I hope you are all staying safe. Well, I’m sure you’ve heard enough about that so how about some different news?

Today, I am announcing the VIP version of my latest course: PyTorch: Deep Learning and Artificial Intelligence

https://www.udemy.com/course/pytorch-deep-learning/?couponCode=PYTORCHVIP

https://www.udemy.com/course/pytorch-deep-learning/?couponCode=PYTORCHVIP27 (expires July 26, 2022)

This is a MASSIVE (over 24 hours) Deep Learning course covering EVERYTHING from scratch. That includes:

• Machine learning basics (linear neurons)
• ANNs, CNNs, and RNNs for images and sequence data
• Time series forecasting and stock predictions (+ why all those fake data scientists are doing it wrong)
• NLP (natural language processing)
• Recommender systems
• Transfer learning for computer vision
• Deep reinforcement learning and applying it by building a stock trading bot

IN ADDITION, you will get some unique and never-before-seen VIP projects:

Estimating prediction uncertainty

Drawing the standard deviation of the prediction along with the prediction itself. This is useful for heteroskedastic data (that means the variance changes as a function of the input). The most popular application where heteroskedasticity appears is stock prices and stock returns – which I know a lot of you are interested in.

It allows you to draw your model predictions like this:

Sometimes, the data is simply such that a spot-on prediction can’t be made. But we can do better by letting the model tell us how certain it is in its predictions.
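One common way to get a model to report its own uncertainty (an assumption here, since the exact method isn't spelled out above) is to have it output both a mean and a log-variance for each input, and to train it with the Gaussian negative log-likelihood instead of plain squared error. The sketch below shows just that loss in plain Python; in the course it would live inside a PyTorch model.

```python
import math


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under a Normal(mean, exp(log_var)) prediction."""
    var = math.exp(log_var)
    return 0.5 * (math.log(2 * math.pi) + log_var + (y - mean) ** 2 / var)


# When the prediction is far off, admitting high variance is penalized less...
confident_miss = gaussian_nll(y=3.0, mean=0.0, log_var=0.0)
uncertain_miss = gaussian_nll(y=3.0, mean=0.0, log_var=2.0)

# ...but when the prediction is accurate, claiming high variance costs you.
confident_hit = gaussian_nll(y=0.1, mean=0.0, log_var=0.0)
uncertain_hit = gaussian_nll(y=0.1, mean=0.0, log_var=2.0)

print(confident_miss > uncertain_miss, confident_hit < uncertain_hit)
```

Because the variance is predicted per input, the model is free to be confident in some regions and uncertain in others, which is exactly what heteroskedastic data calls for.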

Facial recognition with siamese networks

This one is cool. I mean, I don’t have to tell you how big facial recognition has become, right? It’s the single most controversial technology to come out of deep learning. In the past, we looked at simple ways of doing this with classification, but in this section I will teach you about an architecture built specifically for facial recognition.

You will learn how this can work even on small datasets – so you can build a network that recognizes your friends or can even identify all of your coworkers!

You can really impress your boss with this one. Surprise them one day with an app that calls out your coworkers by name every time they walk by your desk. 😉

Please note: The VIP coupon will work only for the next month (ending May 1, 2020). It’s unknown whether the VIP period will renew after that time.

After that, although the VIP content will be removed from Udemy, all who purchased the VIP course will get permanent free access on deeplearningcourses.com.

## Minimal Prerequisites

This course is designed to be a beginner to advanced course. All that is required is that you take my free Numpy prerequisites to learn some basic scientific programming in Python. And it’s free, so why wouldn’t you!?

You will learn things that took me years to learn on my own. For many people, that is worth tens of thousands of dollars by itself.

There is no heavy math, no backpropagation, etc. Why? Because I already have courses on those things. So there’s no need to repeat them here, and PyTorch’s automatic differentiation means you don’t have to implement them yourself. So you can relax and have fun. =)

## Why PyTorch?

All of my deep learning courses until now have been in Tensorflow (and prior to that Theano).

So why learn PyTorch?

Does this mean my future deep learning courses will use PyTorch?

In fact, if you have traveled in machine learning circles recently, you will have noticed that there has been a strong shift to PyTorch.

Case in point: OpenAI switched to PyTorch earlier this year (2020).

Major AI shops such as Apple, JPMorgan Chase, and Qualcomm have adopted PyTorch.

PyTorch is primarily maintained by Facebook (Facebook AI Research to be specific) – the “other” Internet giant who, alongside Google, has a strong vested interest in developing state-of-the-art AI.

But why PyTorch for you and me? (aside from the fact that you might want to work for one of the above companies)

As you know, Tensorflow has adopted the super simple Keras API. This makes common things easy, but it makes uncommon things hard.

With PyTorch, common things take a tiny bit of extra effort, but the upside is that uncommon things are still very easy.

Creating your own custom models and inventing your own ideas is seamless. We will see many examples of that in this course.

For this reason, it is very possible that future deep learning courses will use PyTorch, especially for those advanced topics that many of you have been asking for.

Because of the ease with which you can do advanced things, PyTorch is the main library used by deep learning researchers around the world. If that’s your goal, then PyTorch is for you.

In terms of growth rate, PyTorch dominates Tensorflow. PyTorch now outnumbers Tensorflow by 2:1 and even 3:1 at major machine learning conferences. Researchers hold that PyTorch is superior to Tensorflow in terms of the simplicity of its API, and even speed / performance!

Do you need more convincing?

March 18, 2020

# VIP Promotion

Hello all!

In this post, I am announcing the VIP coupon to my course titled “Artificial Intelligence: Reinforcement Learning in Python”.

There are 2 places to get the course.

1. Udemy, with this VIP coupon: https://www.udemy.com/course/artificial-intelligence-reinforcement-learning-in-python/?couponCode=REINFORCEVIP13 (expires July 26, 2022)
2. Deep Learning Courses (coupon automatically applied): https://deeplearningcourses.com/c/artificial-intelligence-reinforcement-learning-in-python

You may recognize this course as one that has already existed in my catalog – however, the course I am announcing today contains ALL-NEW material. The entire course has been gutted, and none of its lectures existed in the original version.

One of the most common questions I get from students in my PyTorch, Tensorflow 2, and Financial Engineering courses is: “How can I learn reinforcement learning?”

While I do cover RL in those courses, it’s very brief. I’ve essentially summarized 12 hours of material into 2. So by necessity, you will be missing some things.

While that serves as a good way to scratch the surface of RL, it doesn’t give you a true, in-depth understanding that you will get by actually learning each component of RL step-by-step, and most importantly, getting a chance to put everything into code!

This course covers:

• The explore-exploit dilemma and the Bayesian bandit method
• MDPs (Markov Decision Processes)
• Dynamic Programming solution for MDPs
• Monte Carlo Method
• Temporal Difference Method (including Q-Learning)
• Approximation Methods using Radial Basis Functions
• Applying your code to OpenAI Gym with zero effort / code changes
• Building a stock trading bot (different approach in each course!)
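To give a flavor of the first item in the list above, here is a minimal sketch of the Bayesian bandit method (Thompson sampling) for arms with win/lose rewards. The three win rates, the Beta(1, 1) prior, and the function name are assumptions for illustration, not the course's code.

```python
import random


def thompson_sampling(true_rates, n_rounds=2000, seed=0):
    """Play Bernoulli bandits with Beta posteriors; return pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [1] * n_arms    # Beta(1, 1) prior: one pseudo-win per arm
    losses = [1] * n_arms  # ...and one pseudo-loss per arm
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible win rate per arm from its current posterior
        samples = [rng.betavariate(w, l) for w, l in zip(wins, losses)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Observe a reward from the (hidden) true win rate and update the posterior
        reward = 1 if rng.random() < true_rates[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

Exploration happens automatically: an arm that has rarely been pulled has a wide posterior, so it occasionally produces the largest sample and gets another try. As evidence accumulates, most pulls go to the best arm.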

When you get the DeepLearningCourses.com version, note that you will get both versions (new and old) of the course – totalling nearly 20 hours of material.

If you want access to the tic-tac-toe project, this is the version you should get.

Otherwise, if you prefer to use Udemy, that’s fine too. If you purchase on Udemy but would like access to DeepLearningCourses.com, I will allow this since they are the same price. Just send me an email and show me your proof of purchase.

Note that I’m not able to offer the reverse (can’t give you access to Udemy if you purchase on DeepLearningCourses.com, due to operational reasons).

So what are you waiting for?

# How to Speak by Patrick Winston

May 30, 2022

Making a post on this for posterity. A student sent this to me the other day and I thought it was great.

I could probably apply some of this to my courses too!

# Become a Millionaire by Taking my Financial Engineering Course

May 17, 2022

I just got an excellent question today about my Financial Engineering course, which allowed me to put into words many thoughts and ideas I’d been pondering recently.

Through this post, I hope to get all these ideas into one place for future reference.

The question was: “How practical is this course? I’ve skimmed through several top ratings on Udemy but have yet seen one boasting how much money the student made after taking it.”

Will you become a millionaire after taking my financial engineering course?

Let’s answer this question by starting with my own definition of “practical”, and then subsequently addressing the student’s definition of practical which appears to mean “making money”.

In my view, “practical” simply means you’re applying knowledge to a real-world dataset.

For example, my Recommender Systems course is practical because you apply the algorithms we learn to real-world ratings datasets.

My Bayesian Machine Learning: A/B Testing course is practical because you can apply the algorithms to any business scenario where you have to decide between multiple choices based on some numerical objective (e.g. clicks, page view time, etc.)

In the same way, the Financial Engineering course is extremely practical, because the whole course is about applying algorithms to real-world financial datasets. The application is a real-world problem.

This is unlike, say, reading Pattern Recognition and Machine Learning by Bishop, which is all about the algorithms and not the fields of application. The implication is that you know what you’re doing and can take those algorithms and apply them to your own data.

On one hand, that’s powerful – because you can apply these algorithms to any field (like biology, astronomy, chemistry, robotics, control systems, and yes, finance), but at the same time, you have to be pretty smart to do it. The average Udemy student would struggle.

In that sense, this is the most practical you can get. Everything you learn in this course is being directly applied to real-world data in a specific field (finance).

You can grab one of the algorithms taught in the course and start using it today on your own investing account. There’s a lecture about that in the Summary section called “Applying This Course” for those who need extra help.

Importantly, do keep in mind that while I can teach you what to do, I can’t actually make you do it.

In A/B Testing, I can show you the code, but the rest is up to the student to make it practical, by actually getting a job where they get to do that in a production system, or by inserting the code into their own production website so they can feed it to live users.

Funny enough, A/B Testing isn’t even about finance nor money. But will you make money with those techniques? YES. Amazon, Facebook, Netflix, etc. are already using the same techniques with great success.

The only reason some students might say it’s not practical is because they are too lazy/incompetent to get off their butts and actually do it!

Same here. I can teach the algorithms, but I can’t go into your brokerage account and run them for you.

Now let’s consider the definition of “practical” in the sense of being guaranteed to “make money”.

This is a common concern among students who are new to finance and don’t really know yet what to expect.

Let’s suppose I could guarantee that by taking this course, you could make money.

Consider some obvious questions:

• If this were true, anyone (including myself) would just scale it up and become extremely wealthy without doing any work. Clearly, no such thing exists (that is public and that we know of).
• If this were true, why would anyone work? Financial engineering graduates wouldn’t bother to apply for jobs, they would just run algorithms all day. They would teach their friends / family to do the same. No one would ever bother to get a job.
• If this were true, why would hedge funds bother to hire employees? After inventing an algorithm, they could just run it forever. What’s the point of wasting money to hire humans? What would they even do?
• If this were true, why would hedge funds bother to hire PhDs and why would people bother to get PhDs? Imagine you could increase your investments infinitely from a 20 hour online course. What kind of insane person would work for 4-7 years just to get a pittance and a paper that says “PhD”?

On the contrary, the reality is this.

The financial sector does hire very smart people and it is well-known that they have poor work-life balance.

They must be working hard. What are they doing?

Why can’t they just learn an algorithm and sit back and relax?

Instead, let’s expand the definition of “practical”.

Originally, this question was asked in a comment on a video I made about predicting stock prices with LSTMs. Is this video practical? YES. If you didn’t know this, you could have spent weeks / months / maybe even your whole life trying to “predict stock prices with LSTMs”, with zero clue that it didn’t actually work. That would be sad.

Spending weeks or months doing something that doesn’t even make sense is what I would consider to be very impractical. And hence, learning how to avoid it would be very practical.

A lot of the course is about how to properly model and analyze. How to stay away from stupidity.

One of the major themes of the course is that “Santa Claus doesn’t exist”.

A naive person might think “there must be some way to predict the stock price, you are just not telling me about the most advanced algos!”

But the “Santa Claus doesn’t exist” moment is when we prove mathematically why certain predictions are impossible.

This is practical because it saves you from attempting something which doesn’t make any logical sense.

Obviously, it doesn’t fulfill the childhood dream of meeting Santa (predicting an unpredictable time series), but I would posit that trying to meet Santa is what is really impractical.

What is actually practical is learning how to determine whether you can or cannot predict a time series (at which point, you can then make your predictions as normal).

I’ll give you another example lesson.

If you used the simplest trading strategy from this course, you could have beaten the market from 2000 – 2018.

Using the same algorithm, you would have underperformed the market from 2018 to now.

The practical lesson there is that “past performance doesn’t indicate future performance”.

This is how you can have a “practical” lesson, which doesn’t automatically imply “guaranteed rate of return” (which is impossible).

Addendum: actually, it is possible to guarantee a rate of return. Just purchase a fixed-income security like a CD (certificate of deposit) at your bank. The downside is that the rate of return is very low. This is yet another practical lesson from the course – the tradeoff between risk and reward and how real-world entities automatically adjust themselves to match present conditions. In other words, you’ll never find a zero-risk asset that guarantees 1000x returns. Why is this practical? Again, you want to avoid wasting time searching for that which does not exist.

# Machine Learning in Finance by Dixon, Halperin, Bilokon – A Critique

May 16, 2022

Check out the video version of this post on YouTube:

In this post, I’m going to write about one of my all-time favorite subjects: the wrong way to predict stock and cryptocurrency prices.

It’s not every day I get to critique a published book by a big name like Springer.

The book I’m referring to is called “Machine Learning in Finance: From Theory to Practice”, by Matthew Dixon, Igor Halperin, and Paul Bilokon.

Now you might think I’m beating a dead horse with this video, which is kind of true.

I’ve already spoken at length about the many mistakes people make when trying to predict stock prices.

But there are a few key differences with this video.

Firstly, in past videos, I’ve mentioned that it is typically bloggers and marketers who put out this bad content.

This time, it’s not a blogger or marketer, but an Assistant Professor of Applied Math at the Illinois Institute of Technology.

Secondly, while I’ve spoken about what the mistakes are, I’ve never done a case study where I’ve broken down actual code that makes these mistakes.

This is the first.

Thirdly, in my opinion, this is the most important topic to cover for beginners to finance, because it’s always the first thing people try to do. They want to predict future prices so they know what to invest in today.

If you take my course on Financial Engineering, you’ll learn that this is completely untrue. Price prediction barely scratches the surface of true finance.

In order to get the code I’ve used in this video, please use this link: https://bit.ly/3yCER6S

Note that it’s a copy of the code provided with the textbook, with added code for my own experiments (computing the naive forecast and the corresponding train / test MSE).

I also removed code for a different type of RNN called the “alpha RNN”, which uses an old version of Keras. Removing this code doesn’t make a difference in our results because this model didn’t perform well.

The mistakes I’ll cover in this post are as follows.

1) They only standardize the price time series, which does nothing about the problem of extrapolation.

2) They never check whether their models can beat the naive forecast. Spoiler alert: I checked, and they don’t. The models they built are worse than useless.

3) They use a misleading train-test split, with plots that make the models look far better than they are.

So let’s talk about mistake #1, which is why standardizing a price time series does not work.

The problem with prices is that they are ever increasing. This wasn’t the case for the time period used in the textbook, but it is the case in general.

Why is this an issue?

The train set is always in the past, and the test set is always in the future.

Therefore, the values in the test set in general will be higher than the values in the train set.

If you build an autoregressive model based on this data, your model will have to extrapolate to a domain never seen before in the train set.

This is not good, because machine learning models suck at extrapolation.

How they extrapolate has more to do with the model itself than with the data.

We analyzed this phenomenon in my course on time series analysis.

For instance, decision trees tend to extrapolate by going horizontally outward.

Neural networks, Gaussian Processes, and other models all behave differently, and none of these behaviors are related to the data.
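To make the point about standardization concrete, here is a minimal sketch on synthetic data (not the book’s dataset). It assumes the scaler is fit on the train split only. The trending series, the 400/100 split, and the helper name `standardize_with_train_stats` are all made up for illustration.

```python
import numpy as np


def standardize_with_train_stats(train, test):
    """Z-score both splits using only the train mean and standard deviation."""
    mean, std = train.mean(), train.std()
    return (train - mean) / std, (test - mean) / std


# Synthetic upward-trending "price" series (an assumption for illustration).
rng = np.random.default_rng(0)
prices = 100 + 0.5 * np.arange(500) + rng.normal(0, 2, size=500)

# Train is the past, test is the future.
train, test = prices[:400], prices[400:]
train_z, test_z = standardize_with_train_stats(train, test)

# Standardization is a linear rescaling, so any test value above the train
# range is still above the train range afterwards.
frac_outside = np.mean(test_z > train_z.max())
print(f"max train z-score: {train_z.max():.2f}")
print(f"min test z-score:  {test_z.min():.2f}")
print(f"fraction of test points above the train range: {frac_outside:.2f}")
```

Most of the test points land above anything the model saw during training, so the model is still being asked to extrapolate.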

Mistake #2, which is the worst mistake, is that the authors never check against the naive forecast.

As you recall, the naive forecast is when your prediction is simply the last known value.

In their notebook, the authors predict 4 time steps ahead.

So effectively, our naive prediction is the price from 4 time steps in the past.

Even this very dumb prediction beats their fancy RNN models. Surprisingly, this happens not just for the test set, but the train set as well.
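The naive baseline described above takes only a few lines to compute. This is a rough sketch rather than the code from the notebook: the function name `naive_forecast_mse` and the toy series are hypothetical, and only the 4-step horizon comes from the text.

```python
import numpy as np


def naive_forecast_mse(series, horizon):
    """MSE of predicting series[t] with series[t - horizon]."""
    series = np.asarray(series, dtype=float)
    predictions = series[:-horizon]  # value known `horizon` steps earlier
    targets = series[horizon:]       # value we are trying to predict
    return float(np.mean((targets - predictions) ** 2))


# Toy series, not the data from the book.
toy = [1, 2, 3, 4, 5, 6]

# Targets are [5, 6] and the naive predictions are [1, 2].
print(naive_forecast_mse(toy, horizon=4))
```

To run the comparison, compute this number on exactly the same targets the model is scored on, for both the train and test sets, and put it next to the model’s MSE.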

Mistake #3 is the misleading train-test split.

In the notebook, the authors make a plot of their models’ predictions against the true price.

Of course, the error looks very small, and the predictions look very close to the true price in all cases.

But remember that this is misleading. It doesn’t tell you that these models actually suck.

In time series analysis, when we think of a test set, we normally think of it as the forecast horizon.

Here, however, the forecast horizon is only 4 time steps, and the plot just shows the incremental predictions at each time step, each made using true past data.

To be clear, although this is not a forecast, it’s not technically wrong either. It is, however, misleading and totally useless for evaluating the efficacy of these models.

As we saw from mistake #2, even just the naive forecast beats these models, which you wouldn’t know from these seemingly good plots.
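One way to see why such plots prove little is that the naive forecast itself, drawn on the same axes, would also hug the true price. Here is a small sketch on a synthetic drifting random walk. The series is an assumption for illustration and is not the book’s data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic drifting random walk standing in for a price series (assumption).
prices = 100 + np.cumsum(0.1 + rng.normal(0, 1, size=1000))

horizon = 4
naive_pred = prices[:-horizon]  # price from 4 steps in the past
actual = prices[horizon:]       # price we are trying to predict

# A line plot of `actual` against `naive_pred` would look nearly identical,
# which is what a very high correlation between the two series reflects.
corr = np.corrcoef(actual, naive_pred)[0, 1]
mse = np.mean((actual - naive_pred) ** 2)
print(f"correlation between actual and naive prediction: {corr:.3f}")
print(f"naive MSE: {mse:.3f}")
```

A curve that looks close to the true price tells you almost nothing by itself. What matters is whether the model’s error is lower than this baseline’s error.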

So I hope this post serves as a good lesson that you always have to be careful about how you apply machine learning in finance.

Even big name publishers like Springer, and reputable authors who might even be college professors, are not immune to these mistakes.

Don’t trust everything you see, and always experiment and stress test any claims.

# Using Granger Causality to Determine Whether Twitter Sentiment Predicts Bitcoin Price Movement

February 22, 2022

In this article, we are again going to combine my current favorite subjects: natural language processing, time series analysis, and financial analysis.

Recently, I created a couple of lectures covering Granger causality, so this topic is fresh in my mind.

In short, Granger causality is used to determine whether one time series can be used to forecast another (i.e. predict the future).

In these lectures, I demonstrated that some economic variables are Granger causal (in particular, GDP and term spread).

Of course, another easy application is to determine whether or not Twitter sentiment can predict cryptocurrency movements.

This post is based on this short publication: “Does Twitter Predict Bitcoin?” by Shen, D., Urquhart, A. and Wang, P. (2019) and can be found at https://centaur.reading.ac.uk/80420/1/Twitter.Bitcoin.pdf

The premise is quite simple, and you really just have to understand these 3 components in order to implement this yourself:

1) How to get a Twitter sentiment time series

2) How to get a Bitcoin price time series

3) How to implement the Granger causality test

If you can do 1-3, you can predict Bitcoin! (at least, partially)

So let’s go over each of these 3 topics in order.

### How to get a Twitter sentiment time series

This is probably going to be the most difficult part for most students. Most students are used to downloading a CSV dataset that I typically make very nice and simple for my courses.

Unfortunately, real life is not like this.

This becomes a data engineering problem.

Which tweets by which authors do you choose?

Where do you store the tweets?

Once you’ve figured that out, you need to convert the tweets into a number (sentiment) such that the numbers collectively form a time series.

That part is not so hard.

I’ve demonstrated several methods of doing this, such as:

a) training your own model on sentiment data (you could even create your own dataset)

b) using a pretrained Transformer model
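Whichever method produces the per-tweet scores, turning them into a time series is mostly a matter of resampling. The sketch below assumes you already have a timestamp and a sentiment score for each tweet. The column names `created_at` and `sentiment`, the scores themselves, and the choice of a daily mean are all hypothetical.

```python
import pandas as pd

# Hypothetical per-tweet sentiment scores in [-1, 1]; in practice these would
# come from your own classifier or a pretrained model.
tweets = pd.DataFrame(
    {
        "created_at": pd.to_datetime(
            [
                "2022-02-01 09:15",
                "2022-02-01 18:40",
                "2022-02-02 07:05",
                "2022-02-04 12:30",
            ]
        ),
        "sentiment": [0.8, -0.2, 0.5, -0.6],
    }
)

# Average the scores within each calendar day to get one value per day.
daily_sentiment = (
    tweets.set_index("created_at")["sentiment"].resample("D").mean()
)

# Days with no tweets come out as NaN; how to fill them is a modelling choice.
print(daily_sentiment)
```

The same pattern works for hourly or weekly data by changing the resampling frequency, as long as it matches the frequency of the price series.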

### How to get a Bitcoin price time series

In contrast to the first task, this is probably the easiest.

In the past, I’ve demonstrated how you can easily get minute, daily, monthly, etc. data for essentially any ticker using the yfinance Python package.

### How to implement the Granger causality test

For those of you who haven’t learned Time Series Analysis with me in the past, you perhaps have never heard of Granger causality.

In short, we build a multivariate autoregressive time series model called a VAR model.

It takes the form of:

$$y(t) = \sum_{\tau=1}^L A_\tau y(t-\tau) + \varepsilon(t)$$

Essentially, if you find any component $$A_\tau(j,i)$$ is “big enough” (in magnitude), then you can conclude that $$y_i(t)$$ Granger causes $$y_j(t)$$.

As in regression analysis, one decides whether these model coefficients are statistically significant by using hypothesis testing.

It’s important to note that Granger causality is not “true” causality as one usually thinks of it (e.g. eating food causes me to be satiated). Granger causal simply means that one time series is useful in forecasting another (hence the cross-coefficients being non-zero).
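To build some intuition for the cross-coefficients, here is a minimal NumPy sketch that simulates a 2-variable VAR with a single lag (L = 1) and recovers the coefficient matrix by least squares. The matrix `A_true`, the noise level, and the series length are made-up values. A real analysis would use a hypothesis test rather than eyeballing the estimates.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical VAR(1) coefficients: A[0, 1] != 0 means series 1 helps
# forecast series 0, while A[1, 0] == 0 means the reverse does not hold.
A_true = np.array([[0.5, 0.4],
                   [0.0, 0.3]])

# Simulate y(t) = A y(t-1) + noise.
n_steps = 2000
y = np.zeros((n_steps, 2))
for t in range(1, n_steps):
    y[t] = A_true @ y[t - 1] + rng.normal(0, 1, size=2)

# Least-squares fit: stack the lagged values as inputs and the current
# values as targets, then solve X @ B ~ Y.
X, Y = y[:-1], y[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = B.T  # because y(t)^T = y(t-1)^T A^T

print(np.round(A_hat, 2))
```

The estimate for entry (0, 1) should come out clearly away from zero, while entry (1, 0) should be close to zero. That corresponds to series 1 Granger causing series 0, but not the other way around.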

Luckily, the Granger causality test is very easy to use in Python with the statsmodels package.

Suppose you have your 2 time series (BTC returns and Twitter sentiment) in a 2-column dataframe (sidenote: your time series should be stationary so you should use returns and not prices).

Then you simply call the statsmodels function `grangercausalitytests` on that dataframe, passing the maximum lag you want to test.

This will output p-values for each lag order up to the maximum, so you can see whether or not the sentiment up to that lag helps forecast the BTC return.

Final note: unfortunately, the paper only shows that Twitter sentiment Granger causes some function of the squared return. This means we lose information about whether the return is actually going up or down!