Google says its AI Overview doesn't hallucinate after telling users to put glue on pizza

AI News


Google's recently launched AI-powered search feature, AI Overviews, came under scrutiny from US netizens last week. Users accused the new feature of generating strange responses, ranging from telling people to use glue to stick cheese on pizza to listing the health benefits of running with scissors. Now, the tech giant has explained the episode, saying that AI Overviews works differently from a chatbot and does not "hallucinate" in the same way. The company added that many of the screenshots shared by social media users were "fake."

“Last week, odd and erroneous overviews – many of them fake screenshots – were shared on social media. We know people trust Google Search to provide accurate information, and they don't hesitate to let us know when they see something odd or incorrect in our rankings or other search features. Like our users, we hold ourselves to a high standard, so we expect and appreciate the feedback, and take it seriously. Given the attention AI Overviews received, we wanted to explain what happened and the steps we've taken,” the company said in a blog post.

What is AI Overviews?

Before we get into what happened with Google's latest feature, let's understand a bit about what AI Overviews is. In a blog post, Google explains that AI Overviews is designed to help users with more complex searches by providing detailed summaries along with relevant links for further exploration. Unlike traditional chatbots that generate responses based solely on their training data, AI Overviews is integrated with Google's core web ranking systems. This is meant to ensure that the information provided is relevant and backed by high-quality web content.

So while chatbots like ChatGPT and Gemini answer questions based on the data they’ve been trained on, AI Overviews uses data available on the internet, similar to a Google search.

Google's explanation

Google clarified what happened last week, saying in a blog post that it had designed AI Overviews to optimize for accuracy and had conducted extensive testing before releasing the feature. This testing included "thorough red-teaming, evaluation with a sample of typical user queries, and testing on a portion of search traffic to evaluate performance." The tech giant also acknowledged that real-world use by millions of people surfaced many novel searches, including nonsensical queries seemingly crafted to produce erroneous results.

“In addition to designing AI Overviews to optimize for accuracy, we tested the feature extensively before launch, including a robust red-teaming effort, evaluations with samples of typical user queries, and tests on a portion of our search traffic to see how it performed. But there's nothing quite like having millions of people using the feature with many novel searches,” the company said in the blog post.

Fake screenshots and unusual queries

Many social media users shared screenshots of responses from AI Overviews and criticized the new feature. The company said that many of these screenshots were fake and that such responses had never actually appeared in AI Overviews. Other responses were genuinely inaccurate, but in many cases the queries themselves were unusual, such as "How many rocks should I eat?"

The company noted that many of these "fake screenshots" were widely shared. Some of the fabricated results were obviously absurd, while others implied that Google had returned dangerous advice on topics like leaving dogs in cars, smoking while pregnant, and depression. Google clarified that it had never shown these AI Overviews and encouraged users to verify the claims by searching for themselves.

Despite the feature's overall success, Google acknowledged that some genuinely bizarre, inaccurate, or unhelpful AI Overviews did appear. These cases, which generally involved uncommon queries, highlighted specific areas that needed improvement. One such area was the ability to interpret nonsensical queries or satirical content. For example, the query "How many rocks should I eat?" was practically nonexistent before it went viral. The query represents a "data void" or "information gap," where little high-quality content exists on the topic. In this case, satirical content from a geology software provider's website was mistakenly surfaced in the AI Overview.

Google further explained that some of the AI summaries contained sarcastic or troll-like content from discussion forums, which can be valuable sources of trusted, first-hand information but can also lead to less-than-helpful advice, such as using glue to keep cheese on pizza. Additionally, in some cases the AI summaries misinterpreted the language of web pages, resulting in inaccurate information.

Google said it acted swiftly to address these issues, including improving its algorithms and using established processes to remove non-compliant answers.

Author:

Divyanshi Sharma

Published:

May 31, 2024



