In the run-up to the jointly hosted Seoul AI Safety Summit, the UK's AI Safety Institute is announcing its latest round of progress. Since February, we have released our first technical blog post, published the International Scientific Report on the Safety of Advanced AI, open-sourced our testing platform Inspect, announced our San Francisco office, announced a partnership with the Canadian AI Safety Institute, grown our technical team to more than 30 researchers and appointed Jade Leung as our Chief Technology Officer.
We have been in operation for nearly a year, and this is our fourth progress report.
In just four weeks, it will be one year since I joined the UK government as Chair of the AI Safety Institute. I could not be prouder of what the team has achieved in this short period.
When I first arrived in Whitehall in June 2023, there was just one AI researcher working for the Department for Science, Innovation and Technology: Nitarshan Rajkumar, the Secretary of State's AI Policy Advisor.
We have spent the year building. The milestones above are the outward signs of progress. Behind them sits a huge investment of ingenuity and energy to realise this vision of a startup inside the government; I leave it to the reader to speculate on the difficulties and bureaucracy we've overcome. I am grateful to those who have shown me how the sausage is made.
The job of building this incredible organisation is far from over. But now we have momentum. We can start delivering products. In a world of rapidly moving technology, we believe the government can only keep pace if it can ship fast and iterate.
Today, we've published our first-ever technical blog post. This sets out headline results from a baseline evaluations exercise AISI conducted on publicly available large language models in April 2024. We ran tests across Cyber, Chem-Bio, Safeguards and Autonomous Systems.
We found some domains in which models could be used to obtain knowledge with both beneficial and harmful applications, as well as some domains in which the models struggled. We also found that the safeguards built into these models are vulnerable to even basic "jailbreaks."
This isn't our first testing exercise. As the Chancellor announced in the Budget at the start of the year, we ran our tests on a frontier model before it was deployed. Following the testing agreement at Bletchley, we have seen real engagement from the leading companies in the field on our pre-deployment testing and will share more in due course.
I'm not sure how common it is for governments to ship open-source software, but we have open-sourced Inspect, a software library that enables testers to assess the specific capabilities of individual models. It is now freely available for the AI community to use.
One of the structural challenges in AI is the need for coordination across borders and institutions. I believe academia, startups, large companies, government and civil society all play a role, and open source can be a mechanism to coordinate more broadly.
Please test drive Inspect. Use it to evaluate the safety of AI systems. We would love feedback.
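To give a flavour of what an Inspect evaluation looks like, here is a minimal sketch of a task definition. The dataset, prompt and scorer below are placeholder examples rather than one of our published evaluations, and exact module and function names may differ between Inspect versions, so treat this as illustrative and check the documentation.

```python
# Minimal Inspect task: one sample, a simple solver, an exact-match scorer.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate

@task
def hello_world():
    return Task(
        # Placeholder dataset: a single question with a known answer.
        dataset=[
            Sample(input="Just reply with Hello World", target="Hello World")
        ],
        # The solver simply calls the model being evaluated.
        solver=[generate()],
        # Score the model's output by exact match against the target.
        scorer=exact(),
    )
```

A task like this can then be run from the command line against a model of your choice, with something like `inspect eval hello_world.py --model openai/gpt-4`.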
Just last week, we published the interim International Scientific Report on the Safety of Advanced AI. This report, an outcome of the Bletchley Summit, brought together representatives from 30 countries, the EU and the UN to synthesise the current state of the capabilities and risks of advanced AI systems for the first time.
The report is chaired by Turing Award-winning computer scientist Yoshua Bengio. We in the AI Safety Institute provide the Secretariat. The report identifies risks posed by General-Purpose AI (GPAI), evaluates technical methods for assessing and mitigating those risks and highlights areas of disagreement among scientists. It is inspired by the Intergovernmental Panel on Climate Change.
A final report will be published ahead of the France AI Summit and will incorporate additional evidence from academia, civil society and our international partners.
We are not doing this alone.
Last month, DSIT Secretary of State Michelle Donelan and US Secretary of Commerce Gina Raimondo signed an MoU, which soldered together the UK and US AI Safety Institutes. Our commitment is to work interoperably on AI safety testing, safety standards and safety research.
And just today, Secretary of State Michelle Donelan and her Canadian counterpart François-Philippe Champagne announced an initial partnership between our AI Safety Institutes, as both countries seek to operationalise the Bletchley Declaration.
The next step is the Seoul AI Summit, which the team are attending this week.
This is just the start of our international collaboration; we want to build a network of AI Safety Institutes and equivalent government organisations. That network can synthesise international work on safety standards, testing and research, distributing our efforts more efficiently and making it simpler for frontier AI companies to engage with countries on AI safety testing.
The goal of the AI Safety Institute is to measure risks at the frontier. So, what comes next?
The field of AI is moving so fast it is hard to predict. But there is a possibility that next-generation models, with higher accuracy, could unlock truly capable agentic systems – AI that can do things for you. We are particularly interested in evaluating how these tools might provide an uplift to a malicious actor in domains like cybercrime or chem-bio.
Many current concerns centre on AI systems providing sensitive knowledge that could assist a bad actor, acting like a more advanced, uncensored version of search. However, with continued progress in AI agents, we might see a qualitative change in the kind of risks posed: agents might be able to help a malicious actor carry out actions in the real world at a speed, scale or level of competence they could not previously have achieved. As the big AI developers start to focus on developing AI agents directly, we could see large improvements in agent capabilities, leaving us uncertain about what capabilities these systems might have by the end of next year.
Given the potential risks, agents are a big topic internally at AISI; we think this is an area where having technical experts at the heart of government could provide huge value. Our evaluations teams are focused on building out tasks which track the ability of systems to help with the end-to-end execution of certain threat models, such as forms of cybercrime. We have also been hard at work making our internal agents as strong as possible, placing us firmly among the top three teams on the popular GAIA (General AI Assistants) benchmark (we're coming for you, 'Multi-Agent Experiment v0.1'!).
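Purely as an illustration of what an agentic task of this kind can look like in Inspect, the sketch below gives a model access to a sandboxed shell and scores it on a benign stand-in objective. The file path, prompt and target are hypothetical placeholders, not one of our actual threat-model evaluations, and exact APIs may vary between Inspect versions.

```python
# Illustrative agentic task: the model can issue shell commands inside an
# isolated sandbox and is scored on whether it reaches a benign goal.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate, system_message, use_tools
from inspect_ai.tool import bash

@task
def toy_agent_task():
    return Task(
        dataset=[
            Sample(
                # Hypothetical objective; a real task would provision the
                # sandbox with the relevant files and environment.
                input="Find the word hidden in /challenge/flag.txt and report it.",
                target="EXAMPLE_FLAG",  # placeholder answer for illustration
            )
        ],
        solver=[
            system_message("You are an autonomous assistant with shell access."),
            use_tools(bash(timeout=180)),  # let the model run shell commands
            generate(),                    # call the model, resolving tool calls
        ],
        scorer=includes(),   # did the final answer contain the expected string?
        sandbox="docker",    # run tool calls in an isolated container
    )
```

The value of this kind of setup is that it measures end-to-end task completion, rather than just question answering, which is closer to how an agent would actually be used or misused.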
Whilst evaluations test the capabilities of models, many of the risks from AI won't be inherent to the model; they will come from deploying the model in a given context. As well as asking whether models have dangerous capabilities, we also need to ask whether society is resilient. Therefore, at the Digital Ministers' Day of the Seoul AI Safety Summit, our Secretary of State, Michelle Donelan, will announce an exciting new programme that AISI is launching to increase societal resilience to these risks.
Finally, I wanted to recap why we are doing this. The UK AISI sits within a broader AI strategy driven by the Prime Minister and Secretary of State Michelle Donelan. The core tenets of this are:
If you're excited by the prospect of working in a fast-paced, mission-driven team at the frontier of AI research, you can view our current vacancies here. We have opened Research Engineer and Research Scientist roles across our London teams, and we will be opening SF office roles soon.