Memorandum of Understanding signed on collaborative research into an AI safety benchmark suite for LLMs
San Francisco, May 31, 2024--(BUSINESS WIRE)--Today in Singapore, MLCommons® and AI Verify signed a letter of intent to collaborate on developing a common set of safety testing benchmarks for generative AI models, aimed at improving AI safety globally.
A mature safety ecosystem includes collaboration between AI testing companies, national safety agencies, auditors, and researchers. The AI safety benchmarking effort driven by this agreement aims to provide AI developers, integrators, purchasers, and policymakers with a globally accepted baseline approach to safety testing of generative AI.
“There is significant global interest in the generative AI community in developing a common approach to evaluating the safety of generative AI,” said Peter Mattson, president of MLCommons and co-chair of the AI Safety working group. “The MLCommons AI Verify collaboration is a step toward creating a global, comprehensive standard for AI safety testing, with benchmarks designed to address safety risks across different contexts, languages, cultures, and values.”
The MLCommons AI Safety Working Group, a global group of academic researchers, industry technical experts, policy and standards representatives, and civil society advocates, recently published a proof of concept (POC) for the v0.5 AI Safety benchmark. AI Verify will develop interoperable AI testing tools to inform a comprehensive v1.0 release scheduled for this fall. The working group is also building a toolkit for interactive testing to support benchmarking and red teaming.
“Taking this first step towards a globally recognized AI safety benchmark and testing standard, the AI Verify Foundation is pleased to partner with MLCommons to help partners build trust across the diverse cultural backgrounds and languages in which they develop their models and applications. We invite more partners to join us in this effort to advance the responsible use of AI in Singapore and globally,” said Dr. Ong Cheng Hui, Chair of the Steering Committee of the AI Verify Foundation.
The AI Safety Working Group encourages global participation to collaborate on the development of the v1.0 AI Safety Benchmark Suite and beyond. To contribute, join the MLCommons AI Safety Working Group.
About MLCommons
MLCommons is the global leader in building AI benchmarks. It is an open engineering consortium with a mission to make AI better for everyone through benchmarks and data. MLCommons began with the MLPerf® benchmarks in 2018, which quickly expanded into a set of industry metrics for measuring machine learning (ML) performance and promoting transparency in ML and AI technologies. Working with its more than 125 members, including global technology providers, academics, and researchers, MLCommons focuses on collaborative engineering efforts to build tools for the entire AI industry through benchmarks and metrics, public datasets, and best practices.
About AI Verify Foundation
The AI Verify Foundation aims to harness the collective strength and contributions of the global open source community to develop AI testing tools that enable responsible AI. The Foundation promotes AI best practices and standards. The not-for-profit Foundation is a wholly-owned subsidiary of the Infocomm Media Development Authority of Singapore (IMDA).
View source version on businesswire.com: https://www.businesswire.com/news/home/20240530891764/en/
Contacts
Kelly Berschauer
kelly@mlcommons.org
