Can you use an AI chatbot to ensure that other chatbots’ answers are correct?

AI chatbots have become increasingly adept at conversing with humans. The problem, experts say, is that they tend to give inaccurate or nonsensical answers, known as “hallucinations.”

Now, researchers have come up with a potential solution: using chatbots to sniff out errors made by other chatbots.

Sebastian Farquhar, a computer scientist at the University of Oxford and co-author of a research paper published in the journal Nature on Wednesday, argues that chatbots such as ChatGPT and Google's Gemini can be used to weed out AI falsehoods.

Chatbots rely on large language models (LLMs), which ingest large amounts of text from the internet and can be used for a variety of tasks, such as generating text by predicting the next word in a sentence. The bots find patterns through trial and error, and human feedback is then used to fine-tune the model.

But there is a downside: chatbots cannot think like humans and cannot understand what we say.

To test this, Farquhar and his colleagues asked a chatbot the same question several times, then used a second chatbot to check the answers for inconsistencies. This is similar to how police try to trap a suspect by asking them the same question multiple times. If the answers differed widely in meaning, they were probably garbled.

He said the chatbot was asked general trivia questions and elementary school-level math word problems.
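The consistency check described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the function names (`toy_same_meaning`, `cluster_by_meaning`, `meaning_entropy`) are hypothetical, and the second chatbot's judgment of whether two answers mean the same thing is replaced here by a crude text comparison. The sketch groups repeated answers by meaning and scores how spread out they are; a score of zero means every answer agreed, and higher scores mean the answers scattered.

```python
import math
from typing import Callable, List


def toy_same_meaning(a: str, b: str) -> bool:
    # ASSUMPTION: stand-in for the second chatbot. In the study, a second
    # model judges whether two answers mean the same thing; this toy version
    # only compares lowercased text with edge punctuation removed.
    def norm(s: str) -> str:
        return " ".join(s.lower().strip(" .!?").split())

    return norm(a) == norm(b)


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    # Greedy grouping: put each answer in the first cluster whose first
    # member it matches, otherwise start a new cluster.
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(answer, cluster[0]):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def meaning_entropy(
    answers: List[str],
    same_meaning: Callable[[str, str], bool] = toy_same_meaning,
) -> float:
    # Entropy over the share of answers in each meaning cluster.
    # 0.0 means all answers agree; larger values mean more disagreement.
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(
        -(len(c) / total) * math.log(len(c) / total) for c in clusters
    )


# Hypothetical repeated answers to one trivia question.
consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille"]

print(meaning_entropy(consistent))  # all answers fall in one cluster
print(meaning_entropy(scattered))   # every answer is its own cluster
```

In practice the threshold for calling an answer unreliable, and the quality of the meaning comparison, would both need to be tuned; the toy comparison above would miss paraphrases that a language model could catch.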

The researchers cross-checked the accuracy of the chatbot's ratings by comparing them with human ratings on the same subset of questions: the chatbot matched the human raters 93% of the time, and the human raters agreed with each other 92% of the time. That is close enough that having chatbots rate each other is “unlikely to be a concern,” Farquhar said.

Farquhar said that for the average reader, identifying errors in AI is “fairly difficult.”

He said in an email that he often has a hard time spotting these anomalies when using LLMs at work, because chatbots “often tell you what you want to hear, making up things that are not only plausible but would also be helpful if they were true, a practice researchers call 'sycophancy.'”

Unreliable answers are a hindrance to the widespread adoption of AI chatbots, as they “may pose a risk to human lives,” especially in medical fields like radiology, the researchers say, and could lead to false legal precedents and fake news.

Not everyone is convinced that using a chatbot to rate the responses of other chatbots is a great idea.

In an accompanying News and Views article in Nature, Karin Verspoor, a professor of computer technology at RMIT University in Melbourne, Australia, said there were risks in “fighting fire with fire.”

Although the number of errors produced by LLMs appears to decrease when the second chatbot groups answers into semantically similar clusters, “using an LLM to evaluate an LLM-based method is circular and likely biased,” Verspoor wrote.

“Researchers will need to wrestle with the question of whether this approach truly controls the output of LLMs, or whether it is unintentionally adding fuel to the fire by layering multiple systems that are prone to hallucinations and unpredictable errors,” she added.

Farquhar sees it differently, as more “like building a wooden house with wooden cross beams for support.”

“It's not uncommon for reinforcement pieces to support each other,” he said.
