Google's AI Overview is flawed by design, and a new blog post from the company suggests why.

Google's “G” logo surrounded by a selection of whimsical mascot characters, all looking shocked and stunned. (Credit: Google)

On Thursday, Google capped a grueling week of providing inaccurate and sometimes dangerous answers through its experimental AI Overview feature with a follow-up blog post titled “AI Overviews: About last week.” In the post, attributed to Liz Reid, Google vice president and head of Google Search, the company formally acknowledged the feature's problems and outlined the steps it has taken to improve a system that appears to be flawed by design, even if it didn't realize that's what it was admitting.

To recap, the AI Overview feature, which the company showed off at Google I/O a few weeks ago, aims to provide search users with summarized answers to their questions using AI models integrated into Google's web ranking systems. It's an experimental feature that isn't enabled for all users yet, but when participating users search for a topic, they may see an AI-generated answer at the top of the results, pulled from highly ranked web content and summarized by an AI model.
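
In other words, the feature works much like a retrieval-augmented generation (RAG) pipeline: run a traditional ranked search first, then have a language model summarize only what the ranker returned. The sketch below is a rough illustration of that shape, not Google's actual code; the `search_index` and `llm` objects and every function name are hypothetical stand-ins.

```python
# Hypothetical sketch of a retrieval-augmented answer pipeline.
# None of these names are Google's; they only illustrate the shape
# the blog post describes: rank documents first, then summarize
# only what the ranker returned.

def ai_overview(query: str, search_index, llm, k: int = 5) -> dict:
    # Step 1: traditional search, ranking documents for the query.
    top_results = search_index.rank(query)[:k]

    # Step 2: the model sees ONLY the top-ranked snippets, so any
    # SEO spam or satire that ranks highly flows into the prompt.
    context = "\n\n".join(doc.snippet for doc in top_results)
    prompt = (
        "Summarize an answer to the question below using only the "
        f"sources provided.\n\nQuestion: {query}\n\nSources:\n{context}"
    )

    # Step 3: generate a summary plus links back to the sources.
    return {
        "summary": llm.generate(prompt),
        "links": [doc.url for doc in top_results],
    }
```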

While Google claims this approach is “highly effective” and on par with featured snippets in terms of accuracy, the past week has seen numerous examples of the AI system generating strange, inaccurate, or potentially harmful responses, detailed in a recent feature by Ars reporter Kyle Orland that reproduced many of the anomalous outputs.

Drawing inaccurate conclusions from the web

On Wednesday morning, Google's AI Overview incorrectly said that Sony's PlayStation and the Sega Saturn launched in 1993. (Credit: Kyle Orland / Google)

In response to the viral AI Overview examples, Google apologized in the post, stating, “We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.” But in attempting to explain away the mistakes, Reid went into great detail about why AI Overviews returns false information:

AI Overviews work very differently than chatbots and other LLM products that people may have tried out. They're not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That's why AI Overviews don't just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.

This means that AI Overviews generally don't “hallucinate” or make things up in the ways that other LLM products might.

Here the system's fundamental flaw becomes apparent: “AI Overviews are built to only show information that is backed up by top web results.” That design rests on the false assumption that Google's page-ranking algorithm favors accurate results over SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those manipulated, spam-filled results to feed its new AI model.
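
To see why “backed up by top web results” is a weaker guarantee than it sounds, consider a naive grounding check (a hypothetical sketch, not anything Google has described): it can confirm that a claim appears in the retrieved snippets, but it has no way to confirm that the snippets themselves are true.

```python
import re

# Hypothetical grounding check: a claim "passes" if some retrieved
# snippet mostly contains it. This enforces consistency with the
# sources, not correctness; garbage in the index means garbage out.

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def is_grounded(claim: str, snippets: list[str]) -> bool:
    claim_words = words(claim)
    # Crude word-overlap test standing in for a real entailment model.
    return any(
        len(claim_words & words(s)) / len(claim_words) >= 0.8
        for s in snippets
    )

# A satirical snippet that happens to rank highly "grounds" a false
# claim just fine:
snippets = ["Geologists recommend eating at least one small rock per day."]
print(is_grounded("eat at least one small rock per day", snippets))  # True
```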

And even when the AI model pulls from a more accurate source, as with the 1993 game console search seen above, the language model can still draw inaccurate conclusions from that “accurate” data, crafting misinformation from an incomplete summary of the available information.

Google's blog post largely sidesteps the folly of basing AI results on a broken page-ranking algorithm, instead attributing common errors to several other factors, including users running nonsensical searches “aimed at producing erroneous results.” Google does acknowledge faults within its AI model, including misinterpreting queries, misinterpreting “nuances in language on the web,” and lacking enough quality information on certain topics. It also suggests that some of the more egregious examples circulating on social media are fake screenshots.

“Some of these faked results have been obvious and silly,” Reid wrote. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we'd encourage anyone encountering these screenshots to do a search themselves to check.”

(It’s worth noting that while some of the social media examples are undoubtedly fake, attempts to replicate the early examples now would likely fail, since Google has manually blocked many of the offending results. And it is arguably a sign of just how broken Google Search is that people found even the most outlandish fake examples believable in the first place.)

In her post, Reid addressed the nonsensical-search angle, citing the query “How many rocks should I eat a day,” which went viral in a tweet on May 23. “Prior to these screenshots going viral, practically no one asked Google that question,” Reid said. She added that there isn't much content on the web that seriously answers it, creating what she called a “data void” or “information gap” that was filled by satirical content found on the web, which the AI model surfaced as an answer, much like a featured snippet would. In other words, the system was basically working as designed.
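
The “data void” failure mode also suggests an obvious mitigation, sketched below under stated assumptions (the thresholds and the `quality_score` callable are hypothetical, not anything Google has described): if too few credible sources address a query, decline to generate an overview at all.

```python
# Hypothetical data-void guard: skip the AI answer when the index
# lacks credible coverage of a query. The thresholds and the
# quality_score callable are illustrative assumptions, not Google's.

MIN_SOURCES = 3    # require several independent sources
MIN_QUALITY = 0.7  # require each source to clear a credibility bar

def should_show_overview(results: list, quality_score) -> bool:
    credible = [r for r in results if quality_score(r) >= MIN_QUALITY]
    return len(credible) >= MIN_SOURCES

# With only one satirical article answering the rocks question,
# the guard would suppress the overview entirely:
results = ["satirical-geology-article"]
print(should_show_overview(results, lambda r: 0.2))  # False
```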

A screenshot of the AI Overview query that went viral on X last week: “How many rocks should I eat a day?”


