Technical deep dives

The textbooks, courses, and surveys most worth your time if you want to genuinely understand how modern AI systems work.

The list is balanced across foundations, current frontier architectures, training, and interpretability. Most entries are free. Pair one textbook with one current paper for the fastest progress.

01
Deep Learning
Book
Goodfellow, Bengio, Courville · 2016
The free, comprehensive textbook on deep learning.
Why read this. Foundational; still the standard reference.
02
The Little Book of Deep Learning
Book
François Fleuret · 2023
A 200-page modern introduction to deep learning in PDF.
Why read this. The fastest way to refresh the maths.
03
Stanford CS336: Language Modeling from Scratch
Course
Tatsunori Hashimoto et al. · 2025
Build a modern LLM end-to-end across one semester.
Why read this. Unmatched depth on the production stack.
04
The Transformer Family v2
Post
Lilian Weng · 2023
A 100-page survey-style post mapping transformer variants.
Why read this. Best single-document reference for architecture choices.
05
A Mathematical Framework for Transformer Circuits
Paper
Anthropic · 2021
Foundational interpretability paper analysing 1- and 2-layer attention-only transformers.
Why read this. Required reading for interpretability work.
06
Scaling Laws for Neural Language Models
Paper
Kaplan et al. · 2020
The original empirical scaling laws: language-model loss falls as a power law in parameters, data, and compute.
Why read this. Explains why scaling is taken so seriously.
07
Training Compute-Optimal Large Language Models
Paper
Hoffmann et al. · 2022
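Shows that, for a fixed compute budget, model size and training tokens should scale in roughly equal proportion; the resulting Chinchilla model is smaller but trained on more tokens.
Why read this. It reset the field's understanding of compute-optimal training.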
08
Reinforcement Learning: An Introduction
Book
Sutton & Barto · 2018
The canonical RL textbook, free online.
Why read this. RL is back at the centre of frontier training via RLHF and reasoning models.
09
Spinning Up in Deep RL
Course
OpenAI · 2018
OpenAI's pragmatic introduction to deep RL with code.
Why read this. Best onramp into modern policy-gradient methods.
10
Probabilistic Machine Learning
Book
Kevin Murphy · 2022–2023
Two-volume modern probabilistic ML textbook, free online.
Why read this. Deep statistical grounding most ML curricula skip.
11
The Annotated Transformer
Post
Harvard NLP / Sasha Rush · 2018, updated 2022
The original Transformer paper, annotated line by line with executable code.
Why read this. Learn by running, not just reading.
12
Understanding Diffusion Models: A Unified Perspective
Paper
Calvin Luo · 2022
A clean tutorial on the mathematics behind diffusion models.
Why read this. Best single document for diffusion intuition.
13
Neural Networks: Zero to Hero
Course
Andrej Karpathy · 2022–2024
Code-along video series from autograd to GPT.
Why read this. Builds intuition no textbook can.
14
Sparse Autoencoders Find Highly Interpretable Features
Paper
Cunningham et al. · 2023
Demonstrates that SAEs extract monosemantic features from language models.
Why read this. A key result behind current SAE-based mechanistic interpretability.
15
Survey of LLM Reasoning
Paper
Sun et al. · 2024
Survey of reasoning methods, including chain-of-thought, search, and verifier-based approaches.
Why read this. Catch up on the reasoning-model wave.
