In the run-up to the jointly hosted Seoul AI Safety Summit, the UK's AI Safety Institute is announcing its latest round of progress. Since February, we have released our first technical blog post, published the International Scientific Report on the Safety of Advanced AI, open-sourced our testing platform Inspect, announced our San Francisco office, announced a partnership with the Canadian AI Safety Institute, grown our technical team to more than 30 researchers and appointed Jade Leung as our Chief Technology Officer.
We have been in operation for nearly a year, and this is our fourth progress report.
In just four weeks, it will be one year since I joined the UK government as Chair of the AI Safety Institute. I could not be prouder of what the team has achieved in this short period.
When I first arrived in Whitehall in June 2023, there was just one AI researcher working for the Department for Science, Innovation and Technology: Nitarshan Rajkumar, the Secretary of State's AI Policy Advisor.
We have spent the year building. The milestones above are the outward signs of progress. Behind them sits a huge investment of ingenuity and energy to realise this vision of a startup inside the government; I leave it to the reader to speculate on the difficulties and bureaucracy we've overcome. I am grateful to those who have shown me how the sausage is made.
The job of building this incredible organisation is far from over. But now we have momentum. We can start delivering products. In a world of rapidly moving technology, we believe the government can only keep pace if it can ship fast and iterate.
Today, we've published our first-ever technical blog post. This sets out headline results from a baseline evaluations exercise AISI conducted on publicly available large language models in April 2024. We ran tests across Cyber, Chem-Bio, Safeguards and Autonomous Systems.
We found some domains in which models could be used to obtain knowledge with both beneficial and harmful applications, as well as some domains in which the models struggled. We also found that the safeguards built into these models are vulnerable to even basic "jailbreaks."
This isn't our first testing exercise. As the Chancellor announced in the Budget at the start of the year, we ran our tests on a frontier model before it was deployed. Following the testing agreement at Bletchley, we have seen real engagement from the leading companies in the field on our pre-deployment testing and will share more in due course.
I'm not sure how common it is for governments to ship open-source software, but we have open-sourced Inspect, a software library that enables testers to assess the specific capabilities of individual models. It is now freely available for the AI community to use.
One of the structural challenges in AI is the need for coordination across borders and institutions. I believe academia, startups, large companies, government and civil society all play a role, and open source can be a mechanism to coordinate more broadly.
Please test drive Inspect. Use it to evaluate the safety of AI systems. We would love feedback.
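To give a flavour of what an Inspect evaluation looks like, here is a minimal sketch of a task definition. The dataset, prompt and scorer below are placeholder examples rather than one of our published evaluations, and exact module and function names may differ between Inspect versions, so treat this as illustrative and check the documentation.

```python
# Minimal Inspect task: one sample, a simple solver, an exact-match scorer.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate

@task
def hello_world():
    return Task(
        # Placeholder dataset: a single question with a known answer.
        dataset=[
            Sample(input="Just reply with Hello World", target="Hello World")
        ],
        # The solver simply calls the model being evaluated.
        solver=[generate()],
        # Score the model's output by exact match against the target.
        scorer=exact(),
    )
```

A task like this can then be run from the command line against a model of your choice, with something like `inspect eval hello_world.py --model openai/gpt-4`.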
Just last week, we published the interim International Scientific Report on the Safety of Advanced AI. This report, an outcome of the Bletchley Summit, brought together representatives from 30 countries, the EU and the UN to synthesise the current state of the capabilities and risks of advanced AI systems for the first time.
The report is chaired by Turing Award-winning computer scientist Yoshua Bengio. We in the AI Safety Institute provide the Secretariat. The report identifies risks posed by General-Purpose AI (GPAI), evaluates technical methods for assessing and mitigating those risks and highlights areas of disagreement among scientists. It is inspired by the Intergovernmental Panel on Climate Change.
A final report will be published ahead of the France AI Summit and will incorporate additional evidence from academia, civil society and our international partners.
We are not doing this alone.
Last month, DSIT Secretary of State Michelle Donelan and US Secretary of Commerce Gina Raimondo signed an MoU, which soldered together the UK and US AI Safety Institutes. Our commitment is to work interoperably on AI safety testing, safety standards and safety research.
And just today, Secretary of State Michelle Donelan and her Canadian counterpart François-Philippe Champagne announced an initial partnership between our AI Safety Institutes, as both countries seek to operationalise the Bletchley Declaration.
The next step is the Seoul AI Summit, which the team are attending this week.
This is just the start of our international collaboration; we want to build a network of AI Safety Institutes and equivalent government organisations. That network can synthesise international work on safety standards, testing and research, distributing our efforts more efficiently and making it simpler for frontier AI companies to engage with countries on AI safety testing.
The goal of the AI Safety Institute is to measure risks at the frontier. So, what comes next?
The field of AI is moving so fast it is hard to predict. But there is a possibility that next-generation models, with higher accuracy, could unlock truly capable agentic systems – AI that can do things for you. We are particularly interested in evaluating how these tools might provide an uplift to a malicious actor in domains like cybercrime or chem-bio.
Many current concerns centre on AI systems providing sensitive knowledge that could assist a bad actor, acting like a more advanced, uncensored version of search. However, with continued progress in AI agents, we might see a qualitative change in the kind of risks posed: agents might be able to help a malicious actor carry out actions in the real world at a speed, scale or level of competence they could not previously have achieved. As the big AI developers start to focus on developing AI agents directly, we could see large improvements in agent capabilities, leaving us uncertain about what capabilities these systems might have by the end of next year.
Given the potential risks, agents are a big topic internally at AISI; we think this is an area where having technical experts at the heart of government could provide huge value. Our evaluations teams are focused on building out tasks which track the ability of systems to help with the end-to-end execution of certain threat models, such as forms of cybercrime. We have also been hard at work making our internal agents as strong as possible, placing us firmly among the top three teams on the popular GAIA (General AI Assistants) benchmark (we're coming for you, 'Multi-Agent Experiment v0.1'!).
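Purely as an illustration of what an agentic task of this kind can look like in Inspect, the sketch below gives a model access to a sandboxed shell and scores it on a benign stand-in objective. The file path, prompt and target are hypothetical placeholders, not one of our actual threat-model evaluations, and exact APIs may vary between Inspect versions.

```python
# Illustrative agentic task: the model can issue shell commands inside an
# isolated sandbox and is scored on whether it reaches a benign goal.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate, system_message, use_tools
from inspect_ai.tool import bash

@task
def toy_agent_task():
    return Task(
        dataset=[
            Sample(
                # Hypothetical objective; a real task would provision the
                # sandbox with the relevant files and environment.
                input="Find the word hidden in /challenge/flag.txt and report it.",
                target="EXAMPLE_FLAG",  # placeholder answer for illustration
            )
        ],
        solver=[
            system_message("You are an autonomous assistant with shell access."),
            use_tools(bash(timeout=180)),  # let the model run shell commands
            generate(),                    # call the model, resolving tool calls
        ],
        scorer=includes(),   # did the final answer contain the expected string?
        sandbox="docker",    # run tool calls in an isolated container
    )
```

The value of this kind of setup is that it measures end-to-end task completion, rather than just question answering, which is closer to how an agent would actually be used or misused.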
Whilst evaluations test the capabilities of models, many of the risks from AI won't be inherent to the model; they will come from deploying the model in a given context. As well as asking whether models have dangerous capabilities, we also need to ask whether society is resilient. Therefore, at the Digital Ministers' Day of the Seoul AI Safety Summit, our Secretary of State, Michelle Donelan, will announce an exciting new programme that AISI is launching to increase societal resilience to these risks.
Finally, I wanted to recap why we are doing this. The UK AISI sits within a broader AI strategy driven by the Prime Minister and Secretary of State Michelle Donelan. The core tenets of this are:
If you're excited by the prospect of working in a fast-paced, mission-driven team at the frontier of AI research, you can view our current vacancies here. We have opened Research Engineer and Research Scientist roles across our London teams, and we will be opening SF office roles soon.