So little time, so much to learn.
Several books focusing on deep learning have been written in the last few years. The competition is intense, and it's hard to pick the best ones. We've certainly missed some very good candidates, but we believe these books are more than enough to fill the time you dedicate to reading.
Truth be told, we have probably never finished a single one of these books cover to cover. Still, we occasionally revisit specific chapters or sections. The reason is simple: you ingest exactly the information you need at the time you need it.
In contrast to other “best/top” AI book lists you can find out there, we have spent at least a couple of hours with each book, so we can provide an honest review. Finally, we include our own book (Deep Learning in Production), not because we have to, but because we sincerely believe it deserves a place on the list.
Also, note that some of the links below might be affiliate links, and at no additional cost to you, we will earn a commission if you decide to make a purchase after clicking through. If you want to support us, feel free to use them. Otherwise, feel free to ignore them.
After careful consideration, we divided the books along four axes of approaching the topic:
Machine and Deep Learning fundamentals (for beginners).
Framework-centered books: PyTorch, TensorFlow, and Keras.
MLOps: cloud, production, and deep learning engineering.
Deep learning theory.
You can choose the one that works best for you!
Machine and Deep Learning fundamentals
The Hundred-Page Machine Learning Book by Andriy Burkov
If you are a newcomer, this is the book for you. No discussion. If you are not, you will probably find it boring and overlapping with things you already know. Unfortunately, this book did not exist when we started learning ML, so we had to dig all around for information.
The first two chapters focus on machine learning formulation, notation, and key terminology. Later on, Burkov analyzes the most important ML algorithms such as regression, decision trees, support vector machines, and k-nearest neighbors. Chapter 4 is all about gradient descent and the learning process, while Chapter 5 is a collection of best practices; namely, feature engineering, regularization, hyperparameter tuning, and more. Chapter 6 is dedicated to neural networks.
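To make the gradient-descent chapter concrete, here is a minimal sketch of the kind of training loop it covers: fitting y = w·x + b to toy data by minimizing mean squared error. This is our own illustrative example, not code from the book; the data, learning rate, and variable names are all made up.

```python
# Toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b, lr = 0.0, 0.0, 0.05        # parameters and learning rate

for _ in range(2000):
    n = len(xs)
    # Gradients of MSE = mean((w*x + b - y)^2) with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Step against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges near w=2, b=1
```

The same loop, with the gradients computed automatically and the model swapped for a neural network, is essentially what every deep learning framework automates.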
Afterward, Burkov takes an interesting turn and discusses how one can use the aforementioned methods to solve specific problems. He explains common challenges, pitfalls as well as targeted solutions. The book closes with other forms of learning such as unsupervised, self-supervised, and recommender systems.
Things we like about this book:
The consistent, scientific notation. It really sets up very solid principles for your ML career.
It uses the “read first, buy later” principle.
It’s only 160 pages (despite the title).
The very good visualizations.
It covers a very wide range of ML techniques, from regression, decision trees, and SVMs to neural networks, ensemble learning, and unsupervised methods.
Why this might not be appealing to you:
It’s quite math-heavy with limited code examples.
It barely touches deep neural networks.
Explanations for each method can feel a bit shallow due to the small size of the book.
A visual introduction to Deep Learning by Meor Amer
There are many visual learners out there. If you are one of them and want to dive straight into deep learning, this one's for you! You can build your own visual intuitions. Overall, we find the book very easy to parse, as there is a good balance of figures and text. The book has less math and more illustrations compared to the 100-Page ML Book.
What we liked: the attention to detail in explaining backpropagation without getting lost in the math. Backpropagation is undeniably really hard to teach. We believe Meor has done a great job in that respect. What’s more, performance metrics are thoroughly analyzed such as the confusion matrix and the F1 score.
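The metrics the book walks through can be computed by hand on a toy binary-classification example, which is a good sanity check of the definitions. This sketch is ours, not the book's; the labels are invented for illustration (1 = positive, 0 = negative).

```python
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

# The four cells of the confusion matrix
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)                       # of predicted positives, how many were right
recall = tp / (tp + fn)                          # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(tp, fp, fn, tn)
print(round(f1, 3))
```

Working these out once by hand makes library calls like scikit-learn's `f1_score` much less of a black box.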
On the other hand, coders may find it hard to commit. The book provides the fundamentals of the theory, but trying things out is left to the reader. Since the book is quite general and introductory, there may be a gap between theory and practice.
Available in: Gumroad
PyTorch, TensorFlow, and Keras-centered handbooks
Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann
If there is one book to learn PyTorch at any level, this is it. I still refer back to it from time to time. The book has three distinct parts.
Part 1: The first 3 chapters provide a very smooth introduction to PyTorch and tensor operations. But Chapter 4 of this book is a game-changer. It literally describes how to take any piece of data, a video, or a line of text, and represent it as a tensor. It covers medical images, tabular data, and text with concrete examples, which I would have found extremely valuable as a beginner. Chapters 5 and 6 cover all the basics of the learning process with simple neural nets (backpropagation etc.), with a focus on hands-on coding in PyTorch.
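The core idea of Chapter 4 is that any data can become a tensor. As a taste of that, here is one-hot encoding, which turns a line of text into a (sequence_length, vocab_size) grid of 0s and 1s. The book does this with `torch` tensors; this sketch of ours uses plain nested lists to stay dependency-free, and the example string is made up.

```python
text = "abc cab"
vocab = sorted(set(text))                    # character vocabulary, e.g. [' ', 'a', 'b', 'c']
char_to_idx = {c: i for i, c in enumerate(vocab)}

one_hot = []
for ch in text:
    row = [0] * len(vocab)
    row[char_to_idx[ch]] = 1                 # flip exactly one slot per character
    one_hot.append(row)

print(len(one_hot), len(one_hot[0]))         # shape: (sequence_length, vocab_size) = (7, 4)
```

In PyTorch the final step would simply be `torch.tensor(one_hot)`, after which the data is ready for a model.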
Part 2 tackles all aspects of approaching a real-world problem: detecting cancer and lung nodules from 3D image data. It walks you through the whole design and thinking process, along with all the steps you would need to follow as an ML modeling researcher. Although I am a bit biased here, I love this part of the book, and I honestly think the approach presented here is transferable to tackling new problems.
Part 3 covers exporting models from PyTorch and even presents the steps required to perform inference on mobile devices. Even though I am not an expert here, I find it a great resource for engineers who want to learn how to optimize their trained models so they can be served efficiently and used on embedded devices with limited hardware resources.
You can get an exclusive 35% discount by using the code blaisummer21 for all books from Manning Publications.
Deep Learning with Python 2nd Edition by François Chollet
This phenomenal book is based on the Keras framework. The 2nd edition of the book is currently available with a whole bunch of new additions! I strongly recommend going for the 2nd edition.
François Chollet set out on a big journey teaching deep learning from scratch. I find the writing style of the author close to my learning style, even though I am not using TensorFlow and Keras extensively. I am especially interested in his intuitions related to ML and interpolation, as explained in his tweetstorm.
Back to the book: the first 4 chapters provide the newcomer to ML with the foundations, such as tensor operations, backpropagation, basic Keras modules, and how to approach classification and regression problems.
Chapter 5 analyses the trade-off between optimization and generalization and how it relates to the training data. It explains why well-trained models generalize: by approximating the latent manifold of their data, they can make good predictions on new inputs via interpolation.
Chapter 6 teaches you how to deal with a new machine learning project, from setting realistic goals to collecting data, beating a good baseline, and deploying. Chapter 7 takes a deeper look at the Keras API and callbacks.
Chapters 8 and 9 provide a thorough overview of deep learning for computer vision, leveraging convolutional neural networks for image classification and image segmentation. Chapter 10 focuses on processing time series with recurrent neural networks, while Chapter 11 introduces the transformer architecture for text data.
Chapter 12 is a highlight. Various generative models are presented for generating new text and images. I especially enjoyed how Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) are explained, along with the insights about the latent space.
Finally, the book covers advanced concepts for the real world, such as hyperparameter tuning, model ensembles, mixed-precision training, and multi-GPU or multi-TPU training.
AI and Machine Learning for Coders: A Programmer's Guide to Artificial Intelligence by Laurence Moroney
If you’re looking for a complete tutorial on TensorFlow, this is the best choice in our opinion. Laurence Moroney is a Lead AI Advocate at Google with vast experience in TensorFlow and its related libraries. The book is divided into two distinct sections.
The first is a deep dive into machine learning applications and how we can utilize TensorFlow to develop them. Examples include computer vision, natural language processing, time series analysis, and sequence models. You will learn:
How to build CNNs and RNNs with TensorFlow
How to process text, images, and time-series data
How to utilize TensorFlow Datasets for data processing and exploration
The second section is all about using these models in real-life applications. The reader will familiarize themselves with model deployment on mobile or web applications. You will explore:
How to embed models in Android or iOS with TensorFlow Lite
How to take advantage of TensorFlow.js
What TensorFlow Serving is and how to deploy your model
As you may have guessed, the book is very “hands-on”, with lots of code snippets and nice visualizations. The only drawback I can think of is that it’s quite opinionated in terms of libraries, which might be a turn-off for some people.
Deep Learning in Production by Sergios Karagiannakos
Deep Learning in Production takes a hands-on approach to learning MLOps by doing. The premise of the book is that the reader starts with a vanilla deep learning model and works their way towards building a scalable web application. Full of code snippets and visualizations, it’s a great resource for ML researchers and data scientists with a limited software background.
Each chapter deals with a different phase of the machine learning lifecycle. After discussing the design phase, the reader will familiarize themselves with best practices on how to write maintainable deep learning code such as OOP, unit testing, and debugging. Chapter 5 is all about building efficient data pipelines, while Chapter 6 deals with model training in the cloud as well as various distributed training techniques.
Moving on, the book deals with serving and deployment techniques, emphasizing tools such as Flask, uWSGI, Nginx, and Docker. The final two chapters explore MLOps. More specifically, they discuss how to scale a deep learning application with Kubernetes, how to build end-to-end pipelines with TensorFlow Extended (TFX), and how to utilize Google Cloud and Vertex AI.
Some things to note:
All the code is written in TensorFlow 2.0.
The book is quite opinionated in terms of libraries, but it tries to focus on the actual practices rather than the libraries themselves.
Sometimes it can feel a bit shallow, since covering every area in depth is impossible. The goal is to guide the reader toward the things they need to learn, not to dive into every little detail.
Machine Learning Engineering by Andriy Burkov
Machine Learning Engineering is the second book by Burkov and is a great reference for the entire ML lifecycle. Burkov does an excellent job aggregating design patterns and best practices on how to build machine learning applications. When I first read this book, I felt like it contained all of the Google searches and browser bookmarks of my previous years.
Similar to the previous book, each chapter focuses on a separate phase of the ML lifecycle. Starting from the design phase, it describes the challenges and priorities of an ML project. Moving on to data processing and feature engineering, you will find clear explanations of frequently used industry terms, as well as common pitfalls with their corresponding solutions.
The training and evaluation phase is split into three chapters, where Burkov analyzes how to improve the accuracy of the model using techniques such as regularization, hyperparameter tuning, and more. The book also deals with problems such as distribution shift, model calibration, and A/B testing. The final two chapters are my personal favorites, as they discuss deployment strategies, model serving, and maintenance.
The book focuses on the actual practices without providing many code examples and real-life applications.
Sklearn is the main library used throughout the book. Different frameworks and tools are also mentioned, but without going into much detail.
Sometimes it can feel like a huge checklist of “good-to-know” concepts that someone could use as a starting point for further research.
Deep learning theory
Finally, there is only one book when it comes to deep learning theory. I purposely left the theory for the end. Why? Because if you start reading this book page by page, it’s unlikely you will finish it. The “Deep Learning” book is more of a handbook to refer back to for deeper understanding and reliable information from a mathematical perspective.
Deep Learning (Adaptive Computation and Machine Learning series) by Ian Goodfellow, Yoshua Bengio, Aaron Courville
This book introduces a broad range of topics in deep learning theory. It establishes a solid mathematical background. Mathematical areas that are covered include linear algebra, probability theory, information theory, and numerical computation.
Furthermore, the book illustrates deep learning techniques such as regularization, optimization algorithms, convolutional networks, and sequence modeling. Interesting, less commonly covered topics include online recommendation systems, bioinformatics, and video games.
Finally, the book offers insightful theoretical perspectives, such as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.
The accompanying website offers supplementary material for both readers and instructors.
There is no one-size-fits-all book. That’s why we created this overview, with our personal perspectives in it. We believe you will find the book that best matches your skills and interests. Thanks for your interest in deep learning, and stay tuned by subscribing to our newsletter.