AISI is bringing together AI companies and researchers for an invite-only conference to accelerate the design and implementation of frontier AI safety frameworks. This post shares the call for submissions that we sent to conference attendees.
—
Sep 19, 2024
At the AI Seoul Summit, 16 global AI companies committed to develop and publish safety frameworks for frontier AI systems before the French AI Action Summit in February 2025. These frameworks aim to define thresholds above which frontier AI systems would pose intolerable risks to society, and to spell out how the companies intend to identify and mitigate risks to keep them below those thresholds, in a transparent and accountable manner. Committing to produce such frameworks is a laudable step.
However, this was only the first step. The science of AI and AI safety is young. To advance this science, we are organizing a conference, held as a "Road to France" event, that will convene experts from signatory companies and from research organisations to discuss the most pressing challenges in the design and implementation of frontier AI safety frameworks.
The conference will take place on 21-22 November in the San Francisco Bay Area and is co-organized by the UK’s AI Safety Institute and the Centre for the Governance of AI. It is an invite-only event for approximately 100 attendees.
In advance of the conference, we are asking attendees to provide submissions discussing AI safety frameworks. To provide transparency on the conference’s agenda, we are sharing the call for submissions below. Closer to the conference, we also intend to publish a lightly edited set of conference proceedings, including most submissions.
Our Call for Submissions
We welcome submissions from all conference invitees on the following topics concerning the Frontier AI Safety Commitments:
How safety frameworks can be developed and improved:
Existing and draft safety frameworks: Signatories of the commitments are welcome to present their current or draft safety frameworks (or parts thereof) to solicit feedback and discussion.
Improving existing safety frameworks: How can existing safety frameworks be strengthened? How can we adapt best practices from other industries?
Building on safety frameworks (Commitment V): How will safety frameworks need to change over time as AI systems' capabilities improve? How do they need to change when AI systems become capable of posing intolerable levels of risk?
Support for AI safety frameworks: What are common challenges for companies that have yet to produce a frontier AI system and/or a safety framework? What kinds of resources would they find helpful? How can governments, academia, companies, civil society, and other third parties better support them?
Approaches to achieving Outcome 1: “Organisations effectively identify, assess and manage risks when developing and deploying their frontier AI models and systems.”
Model evaluation (Commitment I): What is the role of evaluations in safety frameworks? What does current best practice look like? In what ways might current evaluations fail to identify risks? Where is the science of evaluation lacking and where should it go from here?
Threat modelling and risk assessment (Commitment I): How should frontier AI developers assess risks from their systems? How should evaluation results feed into risk assessment? What additional factors, beyond evaluation results, should play a role in risk assessments? How large are the different risks currently?
Thresholds for intolerable risk (Commitment II): How are thresholds in safety frameworks defined today and how could they be improved? How can the adequacy of these thresholds be validated and assessed? Should thresholds focus exclusively on model capabilities or also take other factors into account? Who should set thresholds and why?
Model deployment risk measures (Commitments III-IV): What techniques are being used to reduce the risks from models during development and around deployment? How can these measures be improved? Are there new technologies that need to be developed? How can the adequacy and robustness of safeguards be assessed?
Cybersecurity measures (Commitments III-IV): What measures can be put in place to secure model weights? What is the risk of model exfiltration? Which models warrant what level of security measures? How effective and costly are different security measures?
Approaches to achieving Outcome 2: “Organisations are accountable for safely developing and deploying their frontier AI models and systems.”
Internal governance and risk management processes to deliver safety frameworks (Commitment VI): What internal governance and risk management processes are currently being implemented by organisations? Where can they be improved? What can be learned from practices in other industries?
Approaches to achieving Outcome 3: “Organisations’ approaches to frontier AI safety are appropriately transparent to external actors, including governments.”
External accountability and transparency (Commitments VII-VIII): What information should be shared publicly? What should be shared with select groups? What should be shared with governments only? What procedures and infrastructure could be put in place to reduce security and confidentiality concerns?
Third-party scrutiny (Commitment VIII): What external model access is currently being provided for frontier systems? How can we further foster the development of the third-party evaluation and auditing industry? What technology and institutions need to be developed to facilitate more robust external scrutiny? What opportunities do external actors have to provide input into the development of safety frameworks?