4 tips to help businesses build better AI agents they can trust

ZDNET's key takeaways

Companies are considering AI agents in a variety of ways.
Professionals should consider how to take advantage of these technologies.
Measurement, collaboration, and experimentation are key.

AI agents will impact every professional role. If your company hasn’t started using agents yet, it soon will, through either off-the-shelf software products or in-house tools that leverage large language models and data sources.

Professionals considering how to use agents in their roles should seek guidance on best practices. One such source is Joel Fron, CTO of Thomson Reuters Labs, which helps the information services company apply generative AI, machine learning, and agentic technologies.

Fron told ZDNET that Thomson Reuters is driving AI innovation through a combination of in-house models and off-the-shelf tools. Alongside the advances coming from frontier labs at big tech companies, Fron and his team make sure the company leverages its own unique knowledge and assets.

“If you look at the core of what makes us great, it’s our ability to synthesize human expertise and information to make judgments that can be delivered to experts,” he said.

“The delivery mechanisms of how expertise is delivered are now evolving. Traditionally it has been delivered through software, but increasingly it is delivered through agents, or agents and software.”

Fron pointed to several key agentic AI achievements at Thomson Reuters, including Westlaw Advantage, an AI-powered legal research tool, and the company’s Deep Research agent, which reviews insights and strategizes like a researcher.

From these projects, Fron said, his team learned four important lessons that professionals can use to build trustworthy agentic AI systems.

1. Measure success

Fron said the first focus is on evaluation: “We need to know what’s good.”

Focusing on evaluation sounds like an obvious requirement, but getting it right, quantifying it, and codifying it is a difficult process, Fron says.

“We’ve been saying for the past three years that this is one of the most important things for building good AI systems, and that continues to be true today in the age of agents,” he said.

Fron’s team tracks and measures agent success in several ways. First, it uses public benchmarks, which he said are a good early indicator of how well a new model is likely to perform.

Second, the team has developed its own internal benchmarks with clear guidance for automated evaluation. “Rather than just saying, ‘How close is the generated answer to a good answer?’ our process is focused on actually defining, ‘Well, what makes an answer good?’” Fron said.

Finally, Thomson Reuters keeps humans in the loop so that evaluation goes beyond automated scoring.

“Automated evaluation helps drive the flywheel faster for development teams, and that’s good because it allows us to test many ideas relatively quickly. But we need the trust and performance evaluation of human experts before shipping,” he said.

“By continuing to rely on that approach, we’ve been able to ship a great product that performs well in the marketplace. I think human input is a key element in helping us do that job well and with confidence.”
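To make the idea of defining what makes an answer good more concrete, here is a minimal Python sketch of a rubric-based check combined with a human sign-off gate. It is an illustration only, not Thomson Reuters’ actual process: the criteria, weights, the 0.8 threshold, and the names `Criterion`, `rubric_score`, and `ready_to_ship` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One explicit statement of what makes an answer good (hypothetical)."""
    name: str
    weight: float
    check: Callable[[str], bool]


def rubric_score(answer: str, rubric: list[Criterion]) -> float:
    """Return the weighted share of rubric criteria the answer satisfies (0.0 to 1.0)."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(answer))
    return earned / total


def ready_to_ship(auto_score: float, expert_approved: bool, threshold: float = 0.8) -> bool:
    """Automated scores speed up iteration, but a human expert must also approve."""
    return auto_score >= threshold and expert_approved


# Illustrative criteria only; a real rubric would be written by domain experts.
RUBRIC = [
    Criterion("cites_a_source", 2.0, lambda a: "[source:" in a),
    Criterion("states_jurisdiction", 1.0, lambda a: "jurisdiction" in a.lower()),
    Criterion("is_concise", 1.0, lambda a: len(a.split()) <= 60),
]

answer = "Under this jurisdiction the notice period is 30 days [source: statute 12]."
score = rubric_score(answer, RUBRIC)
print(score, ready_to_ship(score, expert_approved=False))
```

In this sketch, a perfect automated score still does not clear the gate until an expert approves, which mirrors the point that automated evaluation speeds up development while human judgment decides what ships.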

2. Have an expert present

Fron advised professionals to deeply understand what their agents do and how they function over the long term.

“It’s becoming increasingly important to tie that perception closely to the user experience,” he said. “If you think of these agent systems as human AI collaborators, humans and agents need a common language and a common interface.”

Fron said this common language and interface gives humans valuable insight into the agent’s thought process, and vice versa.

“This is a new and important space for UI experiences, and we believe it is important to tightly couple deep technical understanding of agents with a great user experience.”

While many experts talk about the importance of human-agent collaboration, Fron said the key to success is simple and clear: uniting teams within a business.

“This process is not scientific; it forces designers to sit down with data scientists and talk about what’s going on,” he says. “The more you can bring these two sets of people closer together, and the more often they sit together, the better the penetration of thinking across these two areas will be.”

3. Develop proven capabilities

Despite all the hype that would lead us to believe otherwise, experts need to realize that agents and the models that run them are far from omniscient, Fron said.

Fron said AI models are improving in three areas: writing code, executing plans, and multi-step reasoning. Recent advances have also made it possible to extend a model’s functionality with other software tools.

“What this development means for us as a company is more positive than negative, because if we can take all the hundreds of applications we’ve sold to the market over the decades and disassemble them, it means we have proven professional capabilities,” he said.

“If we can break down these elements into tools for agents, we can actually greatly extend the capabilities of these models, and that’s really the future of agents.”

Rather than viewing agentic AI as an omniscient model that tries to do everything under the sun, Fron advised professionals to give agents access to proven capabilities that people are already using. This is his team’s focus.

“We look at our systems and ask ourselves, ‘Okay, we’ve been building this for human users for years. Now, what kind of ergonomics do we need for agents to operate this system? How do we adapt our processes to lend themselves to operation with agents, not necessarily humans in every case? And what does that approach mean for the way our tools look, feel, and behave?’” he said.
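As a rough illustration of breaking existing software down into tools for agents, the Python sketch below wraps an already-tested function in a small tool registry that an agent could call by name. Everything here is hypothetical: `Tool`, `register`, `call_tool`, and the stand-in `estimate_tax` function are invented for the example and do not describe any Thomson Reuters product or a specific agent framework.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    """A proven capability exposed to an agent, with a description it can read."""
    name: str
    description: str
    run: Callable[..., Any]


def estimate_tax(income: float, rate: float = 0.2) -> float:
    """Stand-in for a long-standing business function built for human users."""
    return round(income * rate, 2)


REGISTRY: dict[str, Tool] = {}


def register(tool: Tool) -> None:
    """Make a tool discoverable by name."""
    REGISTRY[tool.name] = tool


def call_tool(name: str, **kwargs: Any) -> Any:
    """Dispatch an agent's tool request to the underlying, already-tested code."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return REGISTRY[name].run(**kwargs)


register(Tool(
    name="estimate_tax",
    description="Estimate tax owed for a given income at a flat rate.",
    run=estimate_tax,
))

print(call_tool("estimate_tax", income=50000.0))
```

The design choice worth noting is that the agent never reimplements the calculation. It only selects a tool and supplies arguments, so the result comes from code that people already rely on.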

4. Look beyond the firewall

Thomson Reuters Labs recently launched the Trust in AI Alliance, a builder-led forum for senior AI researchers from Anthropic, AWS, Google Cloud, OpenAI, and Thomson Reuters to discuss how trust can be built into agent systems.

Fron said the alliance publicly shares lessons learned to inform the broader industry conversation about trustworthy AI, and also helps senior members of the team learn best practices from industry pioneers.

“We’re trying to focus on explainability and transparency in terms of how these models work,” he said.

Fron said the technology pioneers and their models have significantly reduced the time and effort required to go from zero to 90% accuracy.

“But we’re not in the fight for 90%,” he said. “We’re in a battle between 99% and 99.9%, and we have to think about how we can make the 99 even more accurate. That’s the difference in trust.”
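A quick back-of-the-envelope calculation shows why the gap between 99% and 99.9% matters more than it sounds. The short Python sketch below counts expected wrong answers at each accuracy level; the volume of 10,000 answers is an assumed figure chosen only for illustration, and `expected_errors` is a hypothetical helper.

```python
def expected_errors(accuracy: float, total_answers: int) -> int:
    """Expected number of wrong answers at a given accuracy, rounded to a whole answer."""
    return round(total_answers * (1 - accuracy))


TOTAL = 10_000  # assumed volume, for illustration only

for accuracy in (0.90, 0.99, 0.999):
    print(f"{accuracy:.1%} accurate: {expected_errors(accuracy, TOTAL)} errors")
```

Under that assumption, moving from 99% to 99.9% cuts errors from 100 to 10 per 10,000 answers, a tenfold reduction.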

As part of this process, Thomson Reuters also works with academic institutions. Late last year, the company announced a five-year partnership to establish a Frontier AI collaborative research lab at Imperial College London.

“In these efforts, we focus on that last 29 of precision, because that’s what people are looking to buy from us when we release products to the market,” Fron said.

“Frontier technology organizations will continue to push the boundaries of what’s possible. But for us, margins make or break competitiveness in the legal, tax and compliance world. So that’s what we really need to get right.”
