Recommended AGI Papers
A starter library of the most important AI papers of the last decade.
Architecture · NeurIPS · 2017
Attention Is All You Need
Vaswani et al.
Introduced the Transformer architecture, replacing recurrence with self-attention.
Nearly every frontier model in 2026 is, at heart, a descendant of this paper.
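To make the self-attention idea concrete, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It assumes the query, key, and value vectors are already given. The learned projection matrices, multiple heads, and masking from the paper are left out, and the function names are illustrative rather than taken from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query: softmax(q.k / sqrt(d_k)) weighted sum of values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Self-attention: the same token vectors act as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each output row is a convex combination of the value vectors, so every token's new representation mixes in information from every other token in a single step, with no recurrence.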
Capability · NeurIPS · 2020
Language Models are Few-Shot Learners (GPT-3)
Brown et al.
Showed that scaling Transformers yields strong few-shot performance across many tasks.
Established scale itself as a research direction and launched the modern LLM era.
Capability · arXiv · 2020
Scaling Laws for Neural Language Models
Kaplan et al.
Fitted empirical power-law relationships between loss and model size, dataset size, and training compute.
Turned capability planning into engineering and motivated frontier training budgets.
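A scaling-law fit of this kind can be sketched as a straight-line regression in log-log space. The example below is only an illustration: the data is synthetic, the constants `TRUE_A` and `TRUE_ALPHA` are hypothetical placeholders and not the paper's fitted values, and `fit_power_law` is an illustrative helper name.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-alpha) by least squares on log(size), log(loss)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # A power law is a line in log-log space: log(loss) = log(a) - alpha * log(size).
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a hypothetical power law.
TRUE_A, TRUE_ALPHA = 10.0, 0.08
model_sizes = [1e6, 1e7, 1e8, 1e9]
observed_losses = [TRUE_A * n ** (-TRUE_ALPHA) for n in model_sizes]

a, alpha = fit_power_law(model_sizes, observed_losses)

# Extrapolate the fitted curve to a larger, untrained model size.
predicted_loss = a * 1e10 ** (-alpha)
```

The planning value comes from the last line: once the exponent is fitted on small runs, the same curve gives a loss estimate for a larger run before it is trained.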
Capability · DeepMind / arXiv · 2022
Training Compute-Optimal Large Language Models (Chinchilla)
Hoffmann et al.
Showed that many existing models were over-parameterised relative to training data.
Reshaped training recipes industry-wide.
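The compute-optimal trade-off is often summarised with two approximations: training compute of roughly 6 FLOPs per parameter per token, and roughly 20 training tokens per parameter. The sketch below uses that commonly quoted rule of thumb rather than the paper's fitted coefficients, so the constants and the function name should be read as assumptions.

```python
import math

# Assumption: training cost is approximately 6 FLOPs per parameter per token.
FLOPS_PER_PARAM_TOKEN = 6.0
# Assumption: rule-of-thumb ratio, not the paper's fitted coefficients.
TOKENS_PER_PARAM = 20.0


def compute_optimal_allocation(compute_flops, tokens_per_param=TOKENS_PER_PARAM):
    """Split a FLOP budget C into (parameters N, tokens D) with C = 6*N*D and D = ratio*N."""
    params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens


# A budget of 5.88e23 FLOPs works out to about 7e10 parameters and 1.4e12 tokens.
params, tokens = compute_optimal_allocation(5.88e23)
```

Under these assumptions, parameters and tokens both grow with the square root of the compute budget, so a model sized well above this line for its budget is over-parameterised relative to its training data.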
Capability · TMLR · 2022
Emergent Abilities of Large Language Models
Wei et al.
Documented capabilities that appear at scale and are absent at smaller scales.
Framed the central scientific puzzle and policy worry of frontier scaling.
Capability · NeurIPS · 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Wei et al.
Showed that prompting models to reason step-by-step substantially improves multi-step task performance.
Foundation for the modern wave of reasoning models.
Capability · OpenAI / NeurIPS · 2022
Training Language Models to Follow Instructions with Human Feedback (InstructGPT)
Ouyang et al.
Combined supervised fine-tuning, a learned reward model, and PPO into the RLHF recipe used to align modern chatbots.
Made LLMs usable by ordinary people; foundational for ChatGPT-class products.
Safety · Anthropic · 2022
Constitutional AI: Harmlessness from AI Feedback
Bai et al.
Used an explicit set of principles and AI feedback to reduce reliance on human labelling.
Influential alternative to pure RLHF, shaping current alignment practice.
Capability · Microsoft Research · 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Bubeck et al.
Argued that GPT-4 shows early but uneven signs of general intelligence.
Crystallised the AGI debate in industry and the press.
Forecast · Google DeepMind · 2024
Levels of AGI for Operationalizing Progress on the Path to AGI
Morris et al.
Proposed a six-level framework (Level 0, No AI, through Level 5, Superhuman) for talking about AGI rigorously.
Widely adopted scaffold for the AGI conversation.
Capability · OpenAI · 2023
GPT-4 Technical Report
OpenAI
Capabilities, evaluations, and safety overview of GPT-4.
Set the deployment template that subsequent labs followed.
Capability · Google DeepMind · 2023
Gemini: A Family of Highly Capable Multimodal Models
Google DeepMind
Native multimodal architecture spanning text, vision, audio, and code.
Showed that multimodal training is now the frontier default.
Capability · Meta · 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron et al.
Open-weight frontier-adjacent model family with detailed training and safety methodology.
Catalysed the open-weight ecosystem that competes with closed labs.
Architecture · ICLR · 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Shazeer et al.
Conditional computation that scales total parameters without proportional compute cost.
MoE underlies several leading frontier models today.
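The conditional-computation idea can be sketched as top-k gating: only the k experts with the highest gate scores run, and their outputs are mixed with softmax weights over those k scores. This is a simplified sketch under stated assumptions. The gate logits are passed in directly instead of being produced by a learned gating network, the noise term and load-balancing losses from the paper are omitted, and `top_k_gate` and `moe_layer` are illustrative names.

```python
import math


def top_k_gate(gate_logits, k):
    """Return {expert_index: weight} for the k largest logits, softmax-normalised over those k."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_layer(x, experts, gate_logits, k=2):
    """Evaluate only the selected experts and return their gate-weighted sum."""
    gates = top_k_gate(gate_logits, k)
    output = 0.0
    for index, weight in gates.items():
        # Experts outside the top k are never called, which is where the compute saving comes from.
        output += weight * experts[index](x)
    return output


# Four toy experts; with k=2 only two of them run for this input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 100 * x]
result = moe_layer(2.0, experts, gate_logits=[0.0, 1.0, 1.0, -5.0], k=2)
```

Adding more experts raises the total parameter count, but the per-input cost stays tied to k, not to the number of experts.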
Capability · Nature · 2016
Mastering the Game of Go with Deep Neural Networks and Tree Search (AlphaGo)
Silver et al.
First program to defeat a professional human Go player on a full-sized board, combining deep neural networks with tree search.
Watershed for public belief in modern AI's potential.
Capability · Nature · 2021
Highly Accurate Protein Structure Prediction with AlphaFold
Jumper et al.
Predicted protein structures at accuracy competitive with experimental methods in most cases.
Showed AI delivering frontier-science breakthroughs, not just chat.
Safety · arXiv · 2016
Concrete Problems in AI Safety
Amodei et al.
Catalogued five practical safety problems: negative side effects, reward hacking, scalable oversight, safe exploration, and robustness to distributional shift.
Founding agenda for the empirical AI-safety field.
Safety · arXiv · 2019
Risks from Learned Optimization in Advanced Machine Learning Systems (Mesa-Optimisation)
Hubinger et al.
Formal analysis of how trained systems can develop internal optimisers with their own objectives.
Central conceptual reference for deceptive-alignment concerns.
Forecast · Self-published · 2024
Situational Awareness: The Decade Ahead
Leopold Aschenbrenner
Long-form forecast of compute, capability, and geopolitical dynamics through the 2020s.
Widely read by policymakers; defined a major framing of the AGI race.
Policy · UK DSIT · 2025
International AI Safety Report (interim and 2025)
Yoshua Bengio (chair) et al.
Multi-government state-of-the-science report on advanced AI risks and mitigations.
Most authoritative consensus document on advanced AI risk as of 2026.
Policy · US NIST · 2023
NIST AI Risk Management Framework
NIST
Voluntary framework for managing AI risk, structured around Govern, Map, Measure, Manage.
Reference framework adopted across US procurement and many companies.