OpenAI trained an AI model to generate text that is easily understandable by humans. Here's how they did it:

Tech companies such as Google and ChatGPT maker OpenAI are working to improve the output of their chatbots, using AI, machine learning, and advanced data analytics to optimize language models and their responses. In one such effort, OpenAI reported a breakthrough: training a strong language model to generate text that is easier for weaker language models to verify. The Microsoft-backed company found that this training also makes the text easier for humans to evaluate.

“Making language models produce understandable text is crucial to making them useful to people, especially when dealing with complex tasks like solving mathematical problems,” OpenAI said in a blog post this week.

The company said that although its AI models produced largely correct answers, those answers proved difficult to understand.

“When we asked human raters to evaluate these highly optimized solutions in a time-constrained environment, they made almost twice as many errors as when evaluating non-optimized solutions. This finding highlights the importance of not only accuracy but also clarity and ease of verifiability in AI-generated text,” they added.


How OpenAI found a solution to this problem
OpenAI trained advanced language models to produce text that is both easier for weaker models to verify and easier for humans to evaluate, a process it describes as improving readability. The company set up a two-player “prover-verifier game” with a “prover” that generates a solution and a “verifier” that checks the solution's accuracy.

“To improve readability, we optimize the chains of thought on elementary school arithmetic problems so that they can be verified by weaker models, and check whether this makes them more readable to humans. The training procedure is inspired by the prover-verifier game, a game-theoretic framework that encourages learning agents to solve decision problems in a verifiable way,” OpenAI said.

OpenAI says it used a training scheme involving large and small models from the GPT-4 family, in which a strong model produces a solution that can be easily verified by a much weaker model: the verifier.
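The scheme described above can be sketched in miniature. The snippet below is a toy illustration only, not OpenAI's actual method: the `prover` and `verifier` functions stand in for a strong and a weak language model, and the arithmetic stand-ins are hypothetical placeholders for real model outputs.

```python
# Toy sketch of one round of a prover-verifier game.
# Assumption: `prover` stands in for a strong model writing a
# step-by-step solution, and `verifier` for a weaker model that
# only has to check the steps, not rediscover the answer.

def prover(problem):
    # A strong model would generate a checkable chain of thought;
    # here we just spell the addition out in explicit steps.
    a, b = problem
    return [f"{a} + {b}", f"= {a + b}"]

def verifier(problem, solution_steps):
    # A weak verifier confirms the final step by redoing the
    # (much simpler) check rather than solving from scratch.
    a, b = problem
    return solution_steps[-1] == f"= {a + b}"

def play_round(problem):
    steps = prover(problem)
    # In training, this pass/fail signal would act as a reward
    # pushing the prover toward solutions the verifier can check.
    score = 1.0 if verifier(problem, steps) else 0.0
    return steps, score

steps, score = play_round((12, 34))
```

The key idea the sketch tries to capture is the asymmetry: verifying a well-structured solution is easier than producing one, so optimizing the prover against a weak verifier nudges it toward solutions that are also easier for humans to follow.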



