Technical deep dives
The textbooks, courses, and surveys most worth your time if you want to genuinely understand how modern AI systems work.
The list is balanced across foundations, current frontier architectures, training, and interpretability. Most entries are free. Pair one textbook with one current paper for the fastest progress.
- 01
Deep Learning
Book · Goodfellow, Bengio, Courville · 2016
The free, comprehensive textbook on deep learning.
Why read this. Foundational; still the standard reference.
- 02
The Little Book of Deep Learning
Book · François Fleuret · 2023
A 200-page modern introduction to deep learning in PDF.
Why read this. The fastest way to refresh the maths.
- 03
Stanford CS336: Language Modeling from Scratch
Course · Tatsunori Hashimoto et al. · 2025
Build a modern LLM end-to-end over one academic quarter.
Why read this. Unmatched depth on the production stack.
- 04
The Transformer Family v2
Post · Lilian Weng · 2023
A 100-page survey-style post mapping transformer variants.
Why read this. Best single-document reference for architecture choices.
- 05
A Mathematical Framework for Transformer Circuits
Paper · Anthropic · 2021
Foundational interpretability paper analysing 1- and 2-layer attention-only transformers.
Why read this. Required reading for interpretability work.
- 06
Scaling Laws for Neural Language Models
Paper · Kaplan et al. · 2020
The original empirical scaling laws.
Why read this. Explains why scaling is taken so seriously.
- 07
Training Compute-Optimal Large Language Models
Paper · Hoffmann et al. · 2022
The Chinchilla paper: argues that, for a fixed compute budget, earlier models were too large and trained on too few tokens.
Why read this. Reset the field's understanding of optimal training.
- 08
Reinforcement Learning: An Introduction
Book · Sutton & Barto · 2018
The canonical RL textbook, free online.
Why read this. RL is back at the centre of frontier training via RLHF and reasoning models.
- 09
Spinning Up in Deep RL
Course · OpenAI · 2018
OpenAI's pragmatic introduction to deep RL with code.
Why read this. Best onramp into modern policy-gradient methods.
- 10
Probabilistic Machine Learning
Book · Kevin Murphy · 2022–2023
Two-volume modern probabilistic ML textbook, free online.
Why read this. Deep statistical grounding most ML curricula skip.
- 11
The Annotated Transformer
Post · Harvard NLP / Sasha Rush · 2018, updated 2022
The original Transformer paper, annotated line by line with executable code.
Why read this. Learn by running, not just reading.
- 12
Understanding Diffusion Models: A Unified Perspective
Paper · Calvin Luo · 2022
A clean tutorial on the mathematics behind diffusion models.
Why read this. Best single document for diffusion intuition.
- 13
Karpathy: Neural Networks Zero to Hero
Course · Andrej Karpathy · 2022–2024
Code-along video series from autograd to GPT.
Why read this. Builds intuition no textbook can.
- 14
Sparse Autoencoders Find Highly Interpretable Features in Language Models
Paper · Cunningham et al. · 2023
Demonstrates that sparse autoencoders (SAEs) extract monosemantic features from language models.
Why read this. One of the papers that made sparse autoencoders a core tool of modern mechanistic interpretability.
- 15
Survey of LLM Reasoning
Paper · Sun et al. · 2024
Survey of reasoning methods, including chain-of-thought, search, and verifier-based approaches.
Why read this. Catch up on the reasoning-model wave.