Amazon adds tools to scale AI applications and improve accuracy

Amazon is adding more ways to make generative AI applications easier to create, easier to deploy, and more accurate.

Amazon Web Services announced new ways to build enterprise-grade AI apps at its AWS Summit New York event yesterday, along with tools to improve the accuracy of large language models, a key hurdle in attracting companies wary of the “hallucination” problem of various LLMs.

One of the added features is contextual grounding checks, a technique that evaluates AI-generated answers by cross-referencing source material in real time. Because companies in different industries, working with different types of data, may have different tolerances for accuracy, grounding checks also measure relevance and block answers that fall outside a company's tolerance.
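To make the idea concrete, here is a minimal sketch of a thresholded grounding check. It is not AWS's implementation: the token-overlap scoring, the function names and the threshold values are all assumptions chosen for illustration, standing in for whatever scoring model a real service uses.

```python
import re


def _tokens(text):
    """Lowercased word tokens; a crude stand-in for a real scoring model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(candidate, reference):
    """Fraction of candidate tokens that also appear in reference (0.0 to 1.0)."""
    cand = _tokens(candidate)
    if not cand:
        return 0.0
    return len(cand & _tokens(reference)) / len(cand)


def grounding_check(source, query, answer,
                    grounding_threshold=0.7, relevance_threshold=0.5):
    """Hypothetical check: block an answer that is poorly supported by the
    source or that does not address the query. Thresholds are illustrative."""
    grounding = overlap_score(answer, source)   # is the answer backed by the source?
    relevance = overlap_score(query, answer)    # does the answer address the query?
    blocked = grounding < grounding_threshold or relevance < relevance_threshold
    return {"grounding": grounding, "relevance": relevance, "blocked": blocked}


source = "The refund window is 30 days from the delivery date."
query = "What is the refund window?"
print(grounding_check(source, query, "The refund window is 30 days."))
print(grounding_check(source, query, "Refunds are available for 90 days after purchase."))
```

In this sketch, a company with a low tolerance for error would raise the thresholds, while one that can accept looser answers would lower them.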

Another new feature from AWS is the Guardrails API, which evaluates user prompt inputs and AI model responses for various LLMs in Amazon Bedrock, or for enterprises' own LLMs. This API helps enterprises identify content based on their policies and redact sensitive information, filter harmful content, and block undesirable topics and inappropriate content.
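As a rough illustration of a check that sits outside any particular model, the sketch below applies the same policy function to a user prompt and to a model response. It does not call the Guardrails API; the function name, the denied-topic list and the email pattern are hypothetical stand-ins for policies an enterprise would configure.

```python
import re

# Simplified email pattern, used here only as an example of sensitive data.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def check_text(text, denied_topics=("investment advice",)):
    """Hypothetical policy check usable on prompts and responses alike.

    Blocks text that mentions a denied topic; otherwise redacts email
    addresses and reports whether anything was changed.
    """
    lowered = text.lower()
    for topic in denied_topics:
        if topic in lowered:
            return {"action": "blocked", "text": "", "reason": topic}
    redacted = EMAIL.sub("[REDACTED_EMAIL]", text)
    action = "redacted" if redacted != text else "allowed"
    return {"action": action, "text": redacted, "reason": None}


# The same function screens both directions, independent of the model used.
print(check_text("Can you give me investment advice?"))
print(check_text("Contact me at jane@example.com"))
```

Because the check takes plain text in and returns plain text out, it can wrap any model, which is the property the article attributes to the standalone API.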

“[API] requests can now be customized to be more specific so that the output is appropriate for that request and that input,” Diya Wynn, head of AI at AWS, told Digiday. “The key here is that it gives customers an extra layer or level of security, and the API provides that for any LLM, not just the one that was in Amazon Bedrock.”

In Amazon's testing, contextual grounding checks found and filtered up to 75% of hallucinations in AI model responses, and blocked up to 85% more content when used in conjunction with the Guardrails API. Speaking onstage at AWS Summit NY, Matt Wood, vice president of AI products at AWS, said AI models trained on the public internet use a “very wide range” of datasets compared to the types of datasets and document formats used by enterprises.

“This information is typically quite shallow compared to the depth of information that most organizations deal with on a daily basis,” Wood said. “As you dig deeper into these models of the world, they become a bit like Swiss cheese. There are areas that are dense and areas that are sparse. And when the information density is high and the model has context, the model works really well.”

The updates are just two of many announced at AWS Summit NY, where the e-commerce giant also announced other capabilities for its generative AI platform, including the debut of the new AWS App Studio, which enables business customers to create AI apps from text prompts, and the expansion of Amazon Q Apps, which lets customers build their own AI apps.

Amazon's efforts are one example of how AI model providers are racing to make generative AI tools easier to use, more useful and more accurate. This week, AI startup Writer released an upgrade to its proprietary AI platform. Using retrieval-augmented generation (RAG) with a graph-based approach, Writer introduced a new way to incorporate RAG into the process of analyzing up to 10 million words when developing a chat app. The four-year-old company also introduced updates that allow AI models to explain the process of generating answers (a major industry challenge in improving explainable AI), and new “modes” for customers when reviewing documents for different tasks.

Users don't naturally trust answers that come out of a black box, explained Deanna Dong, head of product marketing at Writer, who added that while generative AI may be “magical,” it's still not a “magic bullet that solves everything.”

“We've found that one-size-fits-all chat apps with open-ended prompts don't always produce the best outcomes for users,” Dong told Digiday. “There's a lot of confusion and they tend to rely on the user to be an expert on the prompts.”

One of the challenges of adopting AI is that companies don't always know what they need or want to build, said Carli DeFilippo, executive vice president of experience at AWS agency partner MediaMonks. That requires providing more examples of what's possible while easing brands' fears about what will happen if something goes wrong with their AI implementation.

“If we get a synopsis or something, [a client] can't just go ahead and do it if [they're] not ready,” DeFilippo said. “It's almost like analysis paralysis. They know they want to get in there, they've been given KPIs, they know their boss is toughing them out, but it's like they have so many options at this point that they need to know exactly what's right and what choice is right for them.”
