Safety & alignment organisations

Independent labs and state-run institutes that evaluate frontier models and research how to align them.

AI safety research splits roughly into three camps: theoretical alignment (how would we make a superhuman system reliably do what we want?), empirical alignment (what works on today's frontier models?), and evaluation (how dangerous are current systems, really?). The organisations below span all three. Several are now embedded in national-security infrastructure through formal evaluation agreements with frontier labs.

  1.

    Machine Intelligence Research Institute (MIRI)

    2000 · Berkeley, USA

    The longest-running AGI-safety research organisation. Pioneered formal work on decision theory, corrigibility, and misalignment risk.

    Agent foundations · Theory
  2.

    Alignment Research Center (ARC)

    2021 · Berkeley, USA

    Theory and evaluations group founded by former OpenAI alignment lead Paul Christiano. Spawned the dangerous-capability evaluation organisation METR.

    Eliciting Latent Knowledge · Evals
  3.

    METR

    2023 · Berkeley, USA

    Model Evaluation and Threat Research. Conducts independent autonomous-capability evaluations of frontier models, including for OpenAI, Anthropic, and the US AISI.

    Autonomy evals · Pre-deployment
  4.

    Apollo Research

    2023 · London, UK

    Specialises in evaluating frontier models for deceptive and scheming behaviour, and in publishing case studies used by safety institutes worldwide.

    Deception · Scheming evals
  5.

    Redwood Research

    2021 · Berkeley, USA

    Empirical alignment lab focused on AI control: techniques that work even if models are misaligned. Co-publishes prominent work with Anthropic.

    Control · Adversarial training
  6.

    UK AI Security Institute (AISI)

    2023 · London, UK

    Government institute conducting pre-deployment evaluations of frontier models on behalf of the United Kingdom. The first state-run frontier-model evaluator.

    National evals · Standards
  7.

    US AI Safety Institute (US AISI)

    2024 · Gaithersburg, USA

    Housed at NIST. Develops US technical standards for frontier-model evaluation and red-teaming, and holds formal evaluation agreements with OpenAI and Anthropic.

    NIST · Evals · Red teaming
  8.

    Center for AI Safety (CAIS)

    2022 · San Francisco, USA

    Non-profit that produced the widely signed 2023 statement on extinction risk from AI. Runs technical research, the SafeBench benchmark competition, and policy outreach.

    Field building · Risk research
  9.

    FAR AI

    2022 · Berkeley, USA

    Independent research non-profit that incubates new alignment research agendas and convenes the Alignment Workshop series.

    Adversarial robustness · Incubation
  10.

    Conjecture

    2022 · London, UK

    Lab pursuing 'cognitive emulation' as a safer alternative to opaque end-to-end systems, alongside vocal policy advocacy.

    Cognitive emulation · Governance

How to use this list: follow at least one independent evaluator (METR or Apollo) and one government institute (UK AISI or US AISI) to triangulate official capability claims against external red-teaming.