Our Chief Scientist, Geoffrey Irving, on why he joined the UK AI Safety Institute and why he thinks other technical folk should too
(This is a long-form version of an earlier tweet thread.)
I’ve been working on AI and machine learning for 10 years, including leading the Reflection Team at OpenAI and the Scalable Alignment Team at DeepMind. But this year, for the first time, I took a job in the government and joined the UK AI Safety Institute. People often ask me why I did this (often because they are considering a similar leap).
I continue to think direct technical work on AI and AGI safety is critically important: we have a ton of hard sociotechnical challenges, but I think good solutions will be available given sufficient time.
In contrast, at AISI I am doing technical work motivated by policy. There are two broad reasons why I switched:
First, I worry that progress on safety is too slow relative to the rate of advancement in AI. This is true both for conceptual and theoretical safety work and for empirical methods. I am hopeful about the in-principle effectiveness of many safety approaches given sufficient time, but we do need that sufficient time. Even restricting to known unknowns: all existing technical safety approaches have key conceptual or empirical milestones to hit before we can confidently rely on them. Many of these problems have been known for years and progress against them has been slow.
There is thus immense value in lifting risks and uncertainties into the open, so that society and governments can make informed decisions. Most employees and leaders at AI labs mean well, but market forces mean that there is constant pressure to prioritise speed over caution. I believe governments have a key role in enabling coordination towards safer actions. Race dynamics can exacerbate risks, but are not inevitable.
Second, I felt the impact I could have at AISI was much larger than in technical safety teams in labs. All aspects of safety are understaffed, but technical work adjacent to policy is even more understaffed, especially among the smaller set of senior researchers. This was backed up by speaking to a lot of people about the decision: all but one person thought AISI was the better choice. It would be bad if all technical safety folk moved to policy-adjacent work, but on the margin, more of us doing so is the right choice.
This was still a significant shift for me, and a bet that I would enjoy policy-adjacent work. This bet has paid off! I am finding it extremely motivating to work at an organisation that is focused on safety, and a personal relief to be doing work I’m confident is good overall. Technical safety work in labs both improves safety and speeds up the overall rate of progress on AI. I hoped that the safety benefits of this work would outweigh the potential risks from speeding up AI progress, and I think the arguments for this are correct in many cases, but I found them uncomfortable to live with day to day.
At the time I started at AISI in April 2024, the bulk of our technical work was empirical evaluations for concerning capabilities and safeguard effectiveness. There was a fun meeting, when I was deciding whether to join, where I asked what I’d be working on at AISI if evaluations were already covered. One of the responses was “What do you think?”, so I went to the whiteboard, wrote a long list of topics to work on, and thought “Okay, yes, there is a ton of work to do.” Most of my time now goes to research on safety cases for loss of control, mapping out arguments for safety across a range of techniques and capability levels (from models too weak to cause severe harms to those strong enough that misalignment is a concern). I also advise on other AISI work, including evaluations and diplomacy.
What is it like working at AISI? Talented, smart, fun people focused on making AI and AGI go well; a great mixture of civil service, policy, ML, and other sciences; delightful lunch and whiteboard conversations; acceptable food (there are more important things than free lunch!). Interdisciplinary work is a delight, and I’ve been impressed by conversations with people in the diplomatic, policy, and legal spheres across AISI and the Department for Science, Innovation and Technology more broadly. There is a sense of common goals throughout, with disagreements focusing on the strategy and details of how to achieve those goals.
Crucially, there is a ton of important work to do now that an organisation like AISI is uniquely able to execute on. It is much easier to avoid risky actions if there is a large space of safe, beneficial actions to choose instead, and AISI has enormous levers to increase the size of this space and take such actions. AISI played a central role in driving discussion and concrete commitments to balance AI progress and risk at both the AI Safety Summit at Bletchley Park and the AI Seoul Summit in Korea. I joined as part of the UK delegation in Seoul and found it both interesting and useful.
So, that is why I decided to join AISI.
If you read that, and some of it resonated with you, then please consider joining our team! We’re hiring research scientists and engineers across our technical teams.