Miles Brundage, a prominent former policy researcher at OpenAI, is starting an institute dedicated to the simple idea that AI companies shouldn’t be allowed to grade their own homework.
Today, Brundage officially announced the AI Verification and Evaluation Institute (AVERI), a new nonprofit that will advocate for subjecting frontier AI models to external audits and work to establish standards for AI auditing.
The announcement coincides with the publication of a research paper co-authored by Brundage and more than 30 AI safety researchers and governance experts that provides a detailed framework for how independent audits of companies building the world’s most powerful AI systems would work.
Brundage spent seven years at OpenAI as a policy researcher and adviser, focusing on how companies should prepare for the emergence of human-like artificial general intelligence. He left the company in October 2024.
“One of the things I learned while working at OpenAI is that companies are figuring out these kinds of standards on their own,” Brundage said. “No one is forcing them to work with third-party experts to verify their safety. They are essentially setting their own rules.”
That creates risk. Major AI labs do run safety and security tests and publish technical reports on many of those assessments, sometimes with the help of external “red team” organizations. But for now, consumers, businesses, and governments have no choice but to trust what the labs say about those tests, and nothing requires the labs to conduct or report the assessments against any particular criteria.
Brundage said that in other industries, audits are used to provide assurance to the public, including consumers, business partners and, to some extent, regulators, that products are safe and have been tested in a rigorous manner.
“When you buy a vacuum cleaner, components like the battery have been tested by independent laboratories against strict safety standards to make sure they won’t catch fire,” he said.
New Institute Drives Policy and Standards
Brundage said AVERI is interested in studying policies that would encourage AI companies to move to a system of rigorous external audits, and in what the standards for those audits should be, but it does not plan to conduct audits itself.
“We are a think tank. We are trying to understand and shape this transition,” he said. “We’re not trying to get every Fortune 500 company as a customer.”
He said existing public accounting, auditing, assurance and testing firms could get into the business of auditing AI safety, or start-ups could be created to take on this role.
AVERI said it has raised $7.5 million toward a $13 million goal that would cover a staff of 14 and two years of operations. Its funders include Halcyon Futures, Fathom, Coefficient Giving, former Y Combinator president Geoff Ralston, Craig Falls, Good Forever Foundation, Sympatico Ventures, and AI Underwriting Company.
The organization says it has also received donations from current and former non-executive employees of frontier AI companies. “These people know where the bodies are buried,” Brundage said, “and they want more accountability.”
Insurers and Investors Could Push for AI Safety Audits
Brundage said there could be several mechanisms to encourage AI companies to start hiring independent auditors. One is that large enterprise customers may require audits for some assurance that the AI models they are buying perform as promised and do not pose hidden risks.
Insurance companies may also push AI audits forward. An insurer providing business-continuity coverage to a large company that relies on AI models for key business processes, for example, might require an audit as a condition of underwriting. Insurers might likewise demand audits before writing policies for major AI companies such as OpenAI, Anthropic, and Google.
“The insurance industry is certainly moving quickly,” Brundage said. “We’re having a lot of conversations with insurance companies.” He noted that one AI-focused insurer, AI Underwriting Company, donated to AVERI because “they see the value of audits as a check on compliance with the standards that they’re developing.”
Investors may also demand AI safety audits to make sure they are not taking on unknown risks, Brundage said. Given that investment firms are now writing checks for millions, and in some cases billions, of dollars to fund AI companies, it makes sense that they would want independent assessments of the safety and security of what those fast-growing startups are building. And if any of the big labs go public, as OpenAI and Anthropic are reportedly preparing to do within the next year or two, without having hired auditors to assess the risks of their AI models, they could be exposed to shareholder lawsuits and SEC enforcement actions if problems later surface and cause a significant drop in the stock price.
Brundage also said that regulations or international agreements could force AI companies to hire independent auditors. There is currently no federal AI regulation in the United States, and it is unclear whether there will be any. President Donald Trump has signed an executive order aimed at cracking down on U.S. states that have passed their own AI regulations; the administration says a single federal standard would be easier for businesses to navigate than a patchwork of state laws. But while moving to penalize states that enact AI rules, the administration has yet to propose national standards of its own.
In other regions, however, the foundations for audits may already be taking shape. The recently enacted EU AI Act does not explicitly require audits of AI companies’ evaluation procedures. But its General-Purpose AI Code of Practice, which serves as a blueprint for how frontier AI labs can comply with the law, says that labs building models that could pose “systemic risk” must give external evaluators access to test those models. The law itself states that if an organization deploys AI for “high-risk” use cases, such as underwriting loans, determining eligibility for social benefits, or making health care decisions, the system must undergo an external “conformity assessment” before it can be brought to market. Some read these provisions of the law and the code as implicitly requiring independent auditors.
Establishing “Assurance Levels” and Finding Qualified Auditors
A research paper published alongside AVERI’s launch lays out a comprehensive vision of what a frontier AI audit should look like. It proposes a framework of “AI assurance levels,” ranging from Level 1, which involves third-party testing with limited access, similar to the external assessments AI companies already commission, to Level 4, which provides “treaty-grade” assurance strong enough to underpin international agreements on AI safety.
Building a cadre of qualified AI auditors presents its own challenges. AI auditing requires a combination of technical expertise and governance knowledge that few people have, and those who do the work can be lured away by lucrative offers from the very companies they audit.
Brundage acknowledged the challenges, but said they can be overcome. He talked about mixing people from different backgrounds to build a “dream team” with the right skill sets. “There may be some people from existing audit firms, some people from cybersecurity penetration testing companies, some people from AI safety nonprofits, and maybe some academics,” he said.
In other industries, from nuclear power to food safety, standards and independent reviews have often been triggered by catastrophes, or at least close calls. Brundage said the hope is that, for AI, audit infrastructure and standards can be established before a crisis occurs.
“From my perspective, the goal is to get to a level of scrutiny that is proportionate to the actual impact and risk of the technology as smoothly and as quickly as possible without going too far,” he said.
