- OpenAI is experimenting with technology to make AI models more transparent.
- In this method, a powerful AI model explains its thought process to a second AI.
- This effort follows major changes made to OpenAI's safety division earlier this year.
OpenAI has developed a new technique to make the thought processes of AI models more transparent: allowing them to talk to each other.
According to Wired, the company plans to showcase the research behind the technology this week and reveal further details in an upcoming paper.
The idea is that by having two AI models debate each other, the more powerful model becomes more open about its reasoning, which can help humans better understand how these models work through problems.
OpenAI tested the technique by having AI models work through a basic math problem: the more powerful model explained its solution step by step, while a second model listened for errors in its answer.
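To make the setup concrete, here is a minimal sketch of that kind of prover-and-checker exchange using the OpenAI Python client. The model names, prompts, and single-round structure are illustrative assumptions, not details from OpenAI's research.

```python
# Minimal sketch of a prover-verifier exchange between two models.
# Model names and prompts are illustrative, not OpenAI's actual setup.
from openai import OpenAI

client = OpenAI()

def ask(model: str, system: str, user: str) -> str:
    """Send one chat-completion request and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content

problem = "A farmer has 17 sheep. All but 9 run away. How many are left?"

# Step 1: the stronger "prover" model explains its solution step by step.
solution = ask(
    model="gpt-4o",  # assumed stand-in for the more powerful model
    system="Solve the problem and show every step of your reasoning.",
    user=problem,
)

# Step 2: a second "verifier" model reads the explanation and flags errors.
verdict = ask(
    model="gpt-4o-mini",  # assumed stand-in for the weaker checking model
    system=(
        "You are checking another model's worked solution. "
        "Point out any step that is wrong or unjustified, "
        "then say whether the final answer is correct."
    ),
    user=f"Problem: {problem}\n\nProposed solution:\n{solution}",
)

print("Prover's solution:\n", solution)
print("\nVerifier's check:\n", verdict)
```

The payoff of this arrangement is legibility: because the stronger model must spell out reasoning that a weaker model can follow and challenge, its explanations become easier for humans to audit as well.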
The technique is one of several that OpenAI has made public in recent weeks. The goal is to help make [artificial general intelligence] “safe and useful,” Yining Chen, an OpenAI researcher working on safety, told Wired. The company also announced new ways of measuring progress toward artificial general intelligence.
The company's new initiative comes after months of turmoil in its safety department. In May, OpenAI co-founder and chief scientist Ilya Sutskever announced he was leaving the company, six months after leading the unsuccessful ouster of CEO Sam Altman. Hours later, another researcher at the company, Jan Leike, followed suit. Leike and Sutskever co-led OpenAI's Superalignment team, which focused on aligning artificial intelligence systems with human interests. A week later, OpenAI policy researcher Gretchen Krueger joined the ranks of departing employees, citing “overlapping concerns.”
Their resignations raised concerns about OpenAI's approach to safety as it develops its technology. In March 2023, Tesla CEO Elon Musk was among the experts who signed an open letter warning about the rapid pace of AI development. More recently, AI expert and University of California, Berkeley professor Stuart Russell said OpenAI's ambition to develop artificial general intelligence without fully validating its safety is “simply unacceptable.”
OpenAI did not immediately respond to Business Insider's request for comment.