How much math knowledge do you need for machine learning and deep learning? Some people say not much. Others say a lot. Both are correct, depending on what you want to achieve.
There are plenty of programming libraries, code snippets, and pretrained models that can help you integrate machine learning into your applications without having a deep knowledge of the underlying math functions.
But there’s no escaping the mathematical foundations of machine learning. At some point in your exploration and mastering of artificial intelligence, you’ll need to come to terms with the lengthy and complicated equations that adorn AI whitepapers and machine learning textbooks.
In this post, I will introduce some of my favorite machine learning math resources. And while I don’t expect you to have fun with machine learning math, I will also try my best to give you some guidelines on how to make the journey a bit more pleasant.
Start with the basics
Many machine learning books tell you that having a working knowledge of linear algebra is enough. I would argue that you need a lot more than that. Extensive experience with linear algebra is a must-have: machine learning algorithms squeeze every last bit out of vector spaces and matrix mathematics.
You also need to know a good bit of statistics and probability, as well as differential and integral calculus, especially if you want to become more involved in deep learning.
There are plenty of good textbooks, online courses, and blogs that explore these topics. But my personal favorite is Khan Academy’s math courses. Sal Khan has done a great job of putting together a comprehensive collection of videos that explain different math topics. And it’s free, which makes it even better.
Although each of the videos (which are also available on YouTube) explains a separate topic, going through the courses end-to-end provides a much richer experience.
I recommend the linear algebra course in particular. Here, you’ll find everything you need about vector spaces, linear transformations, matrix transformations, and coordinate systems. The course has not been tailored for machine learning, and many of the examples are about 2D and 3D graphics systems, which are much easier to visualize than the multidimensional spaces of machine learning problems. But they discuss the same concepts you’ll encounter in machine learning books and whitepapers. The course also contains some hidden gems, like least squares calculations and eigenvectors, which are important topics in machine learning.
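To give a feel for why least squares and eigenvectors matter, here is a minimal sketch in Python with NumPy. The data points and the matrix are made up for illustration and do not come from the Khan Academy course. The first half fits a line to four points with least squares, and the second half computes the eigenvalues and eigenvectors of a small symmetric matrix.

```python
import numpy as np

# Least squares: fit y ≈ slope * x + intercept to a few points.
# The points are illustrative and lie exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Each row of the design matrix is [x_i, 1], so the unknowns are (slope, intercept).
A = np.column_stack([x, np.ones_like(x)])
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

# Eigenvectors: directions that a matrix only stretches, without rotating.
# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eigh(M)

print("slope:", slope, "intercept:", intercept)
print("eigenvalues:", eigenvalues)
```

The same two operations show up later in linear regression and in dimensionality reduction techniques such as principal component analysis, only with many more dimensions.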
The calculus courses are a bit more fragmented, but that might be a good feature for readers who already have a strong foundation and just want to brush up their skills. Khan includes precalculus, differential calculus, and integral calculus courses that cover the foundations. The multivariable calculus course discusses some of the topics that are central to deep learning, such as gradient descent and partial derivatives.
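Gradient descent and partial derivatives fit together in a simple way, which the following sketch tries to show. The function f(x, y) = (x - 3)^2 + (y + 1)^2, the learning rate, and the number of steps are my own illustrative choices, not something taken from the course. The partial derivatives tell us the slope along each axis, and gradient descent repeatedly takes a small step in the opposite direction.

```python
def gradient(x, y):
    """Partial derivatives of f(x, y) = (x - 3)^2 + (y + 1)^2."""
    df_dx = 2.0 * (x - 3.0)
    df_dy = 2.0 * (y + 1.0)
    return df_dx, df_dy


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Move (x, y) against the gradient for a fixed number of steps."""
    for _ in range(steps):
        df_dx, df_dy = gradient(x, y)
        x -= learning_rate * df_dx
        y -= learning_rate * df_dy
    return x, y


# Start at the origin; the minimum of f is at (3, -1).
x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min)
```

Training a neural network follows the same pattern, except that the function being minimized is a loss with millions of variables instead of two.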
There are also several statistics courses on Khan Academy’s platform, and there are some overlaps between them. They all discuss some of the key concepts you need in data science and machine learning, such as random variables, distributions, confidence intervals, and the difference between continuous and categorical data. I recommend the college statistics course, which includes some extra material that is relevant to machine learning, such as Bayes’ theorem.
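As a quick illustration of Bayes’ theorem, here is a small Python sketch built around the classic diagnostic test example. All the probabilities are hypothetical numbers I picked for the example. The point is that P(A | B) = P(B | A) * P(A) / P(B), and that the result can be far from what intuition suggests.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive test) computed with Bayes' theorem."""
    # Total probability of a positive test, with or without the condition.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test catches
# 95% of real cases, and it wrongly flags 5% of healthy people.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # roughly 0.16
```

Even with a fairly accurate test, a positive result here means only about a 16 percent chance of having the condition, because the condition itself is rare.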
To be clear, Khan Academy’s courses are not a replacement for the math textbook and classroom. They are not very rich in exercises. But they are very rich in examples, and for someone who just needs to blow the dust off their algebra knowledge, they’re great. Sal talks very slowly, probably to make the videos accessible to a wider audience that includes non-native English speakers. I run the videos at 1.5x speed and have no problem understanding them, so don’t let the video lengths daunt you.
Specialized books and courses on machine learning math
Vanilla algebra and calculus are not enough to get comfortable with the mathematics of machine learning. Machine learning concepts such as loss functions, learning rate, activation functions, and dimensionality reduction are not covered in classic math books. There are more specialized resources for that.
My favorite is Mathematics for Machine Learning. Written by three AI researchers, the book provides you with a strong foundation to explore the workings of different components of machine learning algorithms.
The book is split into two parts. The first part is mathematical foundations, which is basically a revision of key linear algebra and calculus concepts. The authors cover a lot of material in little more than 200 pages, so most of it is skimmed over with one or two examples. If you have a strong foundation, this part will be a pleasant read. If you find it hard to grasp, you can combine the chapters with select videos from Khan’s YouTube channel. It’ll become much easier.
The second part of the book focuses on machine learning mathematics. You’ll get into topics such as regression, dimensionality reduction, support vector machines, and more. There’s no discussion of artificial neural networks and deep learning concepts, but being focused on the basics makes this book a very good introduction to the mathematics of machine learning.
As the authors write on their website: “The book is not intended to cover advanced machine learning techniques because there are already plenty of books doing this. Instead, we aim to provide the necessary mathematical skills to read those other books.”
For a more advanced take on deep learning, I recommend Hands-on Mathematics for Deep Learning. This book also contains an intro on linear algebra, calculus, and probability and statistics. Again, this section is for people who just want to jog their memory. It’s not a basic introductory book.
The real value of this book comes in the second section, where you go into the mathematics of multilayer perceptrons, convolutional neural networks (CNN), and recurrent neural networks (RNN). The book also goes into the logic of other crucial concepts such as regularization (L1 and L2 norm), dropout layers, and more.
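To make the L1 and L2 norms mentioned above a bit more concrete, here is a minimal sketch in Python with NumPy. The weight values and the strength parameter lam are hypothetical, and the functions are my own illustration rather than code from the book. Both penalties add a cost to the loss that grows with the size of the weights: L1 uses absolute values, L2 uses squares.

```python
import numpy as np


def l1_penalty(weights, lam):
    """L1 regularization term: lam times the sum of absolute weights."""
    return lam * np.sum(np.abs(weights))


def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * np.sum(weights ** 2)


# Hypothetical weights of a small model and a regularization strength.
weights = np.array([0.5, -1.5, 2.0])
lam = 0.1

l1 = l1_penalty(weights, lam)  # 0.1 * (0.5 + 1.5 + 2.0)
l2 = l2_penalty(weights, lam)  # 0.1 * (0.25 + 2.25 + 4.0)
print(l1, l2)
```

Here lam is exactly the kind of hyperparameter whose effect becomes easier to reason about once you see the formula: a larger value punishes large weights more heavily.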
These are concepts that you’ll encounter in most books on machine learning and deep learning. But knowing the mathematical foundations will help you better understand the role hyperparameters play in improving the performance of your machine learning models.
A bonus section dives into advanced deep learning concepts, such as the attention mechanism that has made Transformers so efficient and popular, generative models such as autoencoders and generative adversarial networks, and the mathematics of transfer learning.
When should you learn machine learning mathematics?
Admittedly, mathematics is not the most fun way to start machine learning education, especially if you’re self-learning. Fortunately, as I said at the beginning of this article, you don’t need to begin your machine learning education by poring over double integrals, partial derivatives, and mathematical equations that span a page’s width.
You can start with some of the more practical resources on data science and machine learning. A good introductory book is Principles of Data Science, which gives you a good overview of data science and machine learning fundamentals along with hands-on coding examples in Python and light mathematics. Hands-on Machine Learning and Python Machine Learning are two other books that are a little more advanced and also give deeper coverage of the mathematical concepts. Udemy’s Machine Learning A-Z is an online course that combines coding with visualization in a very intuitive way.
I would recommend starting with one or two of the above-mentioned books and courses. They will give you a working knowledge of the basics of machine learning and deep learning and prepare your mind for the mathematical foundations. Once you have a solid grasp of different machine learning algorithms, learning the mathematical foundations becomes much more pleasant.
As you master the mathematics of machine learning, you will find it easier to find new ways to optimize your models and tweak them for better performance. You’ll also be able to read the cutting-edge papers that explain the latest findings and techniques in deep learning, and to integrate them into your applications. In my experience, the mathematics of machine learning is an ongoing educational experience. Always look for new ways to hone your skills.