White House collaborates with Anthropic on security rules for AI models

The White House and Anthropic are working on a framework to assess the severity of security flaws in new AI models and guide potential government intervention, according to a senior White House official and government officials familiar with the matter who were granted anonymity to discuss it with POLITICO.

The initiative comes after the White House imposed export controls on Anthropic, citing security flaws known in the industry as jailbreaks, a move that forced the company to suspend all users’ access to its latest powerful AI models, Fable 5 and Mythos 5.

POLITICO previously reported that administration officials and Anthropic CEO Dario Amodei are at odds over the seriousness of the jailbreaks, but the technology has outpaced the government’s infrastructure for defining and evaluating such disputes. POLITICO, like Business Insider, is part of the Axel Springer Global Reporters Network.

The attempt to create a standardized method for evaluating this and similar incidents in the future highlights how the government is rushing to establish guardrails for new, more powerful models that some fear could threaten the economy and national security if left unchecked.

Anthropic’s negotiations with the administration are part of the company’s initial defense of its model and reflect an understanding that no AI model is completely immune to hacking and that governments need to set rules for companies to measure security risks. Other major AI companies and national leaders echoed that recognition at the G7 meeting in France earlier this week.

Discussions between the White House and Anthropic, led on the company’s side by public policy director Sarah Heck and co-founder Tom Brown, aim to develop common benchmarks that can be used to evaluate future jailbreaks, including the extent to which safeguards are circumvented, the functionality exposed, and the practical impact of the breach.

Anthropic and the White House did not immediately respond to requests for comment.

Export restrictions on Anthropic have not yet been lifted, but the move to a technical standards-setting exercise is a sign that negotiations are making progress. Negotiations had effectively collapsed on Friday when Anthropic rejected a request to pull Fable from deployment, saying the vulnerability was limited and did not constitute a major security flaw.

In response, the White House imposed export controls that barred foreign users from accessing the model, forcing the company to exit the market.

But over the weekend, government officials and Anthropic leaders held a series of lengthy phone conversations that included Anthropic co-founder Tom Brown, Secretary of Commerce Howard Lutnick, and National Cyber Director Sean Cairncross. Those conversations led to nearly a week of in-person meetings in Washington. Anthropic sent senior researchers and security experts to the Department of Commerce on Monday to coordinate with government officials.

This story originally appeared on POLITICO, courtesy of the Axel Springer Global Reporters Network, which leverages the resources of its newsrooms to publish ambitious scoops, investigations, interviews, opinion pieces and analysis. This allows journalists from POLITICO, Business Insider, WELT, BILD, Onet, Fakt and more to collaborate on important stories across platforms and reach an international audience of hundreds of millions of people.