I have always emphasized on the importance of mathematics in machine learning.

Here is a compilation of resources (books, videos & papers) to get you going.

(Note: It's not an exhaustive list but I have carefully curated it based on my experience and observations)
📘 Mathematics for Machine Learning

by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong

https://mml-book.github.io/ 

Note: this is probably the place you want to start. Start slowly and work on some examples. Pay close attention to the notation and get comfortable with it.
📘 Pattern Recognition and Machine Learning

by Christopher Bishop

Note: Prior to the book above, this is the book that I used to recommend to get familiar with math-related concepts used in machine learning. A very solid book in my view and it's heavily referenced in academia.
📘 The Elements of Statistical Learning

by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie

Mote: machine learning deals with data and in turn uncertainty which is what statistics teach. Get comfortable with topics like estimators, statistical significance,...
📘 Probability Theory: The Logic of Science

by E. T. Jaynes

Note: In machine learning, we are interested in building probabilistic models and thus you will come across concepts from probability theory like conditional probability and different probability distributions.
📺 Multivariate Calculus by Imperial College London

by Dr. Sam Cooper & Dr. David Dye

https://www.youtube.com/playlist?list=PLiiljHvN6z193BBzS0Ln8NnqQmzimTW23

Note: backpropagation is a key algorithm for training deep neural nets that rely on Calculus. Get familiar with concepts like chain rule, Jacobian, gradient descent,.
📜 The Matrix Calculus You Need For Deep Learning

by Terence Parr & Jeremy Howard

https://arxiv.org/abs/1802.01528 

Note: In deep learning, you need to understand a bunch of fundamental matrix operations. If you want to dive deep into the math of matrix calculus this is your guide.
📺 Mathematics for Machine Learning - Linear Algebra

by Dr. Sam Cooper & Dr. David Dye

https://www.youtube.com/playlist?list=PLiiljHvN6z1_o1ztXTKWPrShrMrBLo5P3

Note: a great companion to the previous video lectures. Neural networks perform transformations on data and you need linear algebra to get better intuitions.
📘 Information Theory, Inference and Learning Algorithms

by David J. C. MacKay

Note: When you are applying machine learning you are dealing with information processing which in essence relies on ideas from information theory such as entropy and KL Divergence,...
You can follow @omarsar0.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled: