
AISI Research | Developing and conducting model evaluations. Advancing foundational safety and societal resilience research.

AISI develops and conducts model evaluations to assess risks from cyber, chemical, and biological misuse, autonomous capabilities, and the effectiveness of safeguards. We are also working to advance foundational safety and societal resilience research.


Pre-Deployment Evaluation of OpenAI’s o1 Model

Research

December 18, 2024

The UK Artificial Intelligence Safety Institute and the U.S. Artificial Intelligence Safety Institute conducted a joint pre-deployment evaluation of OpenAI's o1 model.

Long-Form Tasks

Research

December 3, 2024

A Methodology for Evaluating Scientific Assistants

Pre-Deployment Evaluation of Anthropic’s Upgraded Claude 3.5 Sonnet

Research

November 19, 2024

The UK Artificial Intelligence Safety Institute and the U.S. Artificial Intelligence Safety Institute conducted a joint pre-deployment evaluation of Anthropic's latest model.

Safety case template for ‘inability’ arguments

Research

November 14, 2024

How to write part of a safety case showing a system does not have offensive cyber capabilities.

Announcing Inspect Evals

Research

November 13, 2024

We’re open-sourcing dozens of LLM evaluations to advance safety research in the field.

Bounty programme for novel evaluations and agent scaffolding

Research

November 5, 2024

We are launching a bounty for novel evaluations and agent scaffolds to help assess dangerous capabilities in frontier AI systems.

Early lessons from evaluating frontier AI systems

Research

October 24, 2024

We look into the evolving role of third-party evaluators in assessing AI safety, and explore how to design robust, impactful testing frameworks.

Should AI systems behave like people?

Research

September 25, 2024

We studied whether people want AI to be more human-like.

Early Insights from Developing Question-Answer Evaluations for Frontier AI

Research

September 23, 2024

A common technique for quickly assessing AI capabilities is prompting models to answer hundreds of questions, then automatically scoring the answers. We share insights from months of using this method.

Conference on frontier AI safety frameworks

Research

September 19, 2024

AISI is bringing together AI companies and researchers for an invite-only conference to accelerate the design and implementation of frontier AI safety frameworks. This post shares the call for submissions that we sent to conference attendees.

Cross-post: "Interviewing AI researchers on automation of AI R&D" by Epoch AI

Research

August 27, 2024

AISI funded Epoch AI to explore AI researchers’ differing predictions on the automation of AI research and development, and their suggestions for how to evaluate relevant capabilities.

Safety cases at AISI

Research

August 23, 2024

As a complement to our empirical evaluations of frontier AI models, AISI is planning a series of collaborations and research projects sketching safety cases for more advanced models than exist today, focusing on risks from loss of control and autonomy. By a safety case, we mean a structured argument that an AI system is safe within a particular training or deployment context.

Advanced AI evaluations at AISI: May update

Research

May 20, 2024

We tested leading AI models for cyber, chemical, biological, and agent capabilities and safeguards effectiveness. Our first technical blog post shares a snapshot of our methods and results.

International Scientific Report on the Safety of Advanced AI: Interim Report

Research

May 17, 2024

This is an up-to-date, evidence-based report on the science of advanced AI safety. It highlights findings about AI progress, risks, and areas of disagreement in the field. The report is chaired by Yoshua Bengio and coordinated by AISI.

Open-sourcing our testing framework Inspect

Research

April 21, 2024

We open-sourced our framework for large language model evaluation, which provides facilities for prompt engineering, tool usage, multi-turn dialogue, and model-graded evaluations.

Our approach to evaluations

Research

February 9, 2024

This post offers an overview of why we are doing this work, what we are testing for, how we select models, our recent demonstrations, and our plans for future work.