Here's a problem if technology-driven productivity is supposed to quickly recharge the US economy: artificial intelligence continues to attract capital faster than actual users. Goldman Sachs' latest adoption tracker shows AI use among US companies at 9.7% in the third quarter. Finance and real estate firms are adding AI at the fastest pace, while education is pulling back. "Broadcasting and publishing companies have reported the biggest increase in AI adoption over the next six months," the bank notes.

But investment is a different story. Since ChatGPT's debut in late 2022, analysts have raised their end-2025 revenue forecasts by $203 billion, about 0.7% of US GDP, plus a further $123 billion, or 0.4% of GDP, for other hardware suppliers. Meanwhile, shipments of AI-related components have risen in the US, Japan and Canada, though some of the recent surge is tied to tariff frontloading.
Investment bank UBS expects this year's splurge on AI infrastructure to grow to $500 billion in 2026. According to the New York Times, second-quarter investment in software and computing gear made this the largest technology investment cycle since the internet boom of the 1990s.
The outcome of these two trends is a now-familiar mismatch: companies are buying AI infrastructure but remain reluctant to incorporate the technology into their daily operations. As Goldman Sachs puts it: "Recent industry research suggests that concerns about data security, quality and availability are the highest barriers to adoption."
Perhaps that insight helps explain why large companies (those with more than 250 employees) and medium-sized businesses (100-249 workers) actually saw adoption rates tick down, even though future adoption was expected to be strong. It also chimes with another finding: a recent MIT study found that 95% of organizations have seen no return on their AI investments.
My guess at what's going on: large companies in particular may be wary of large language models' tendency to hallucinate. If so, OpenAI's new paper, "Why Language Models Hallucinate," offers both a diagnosis of these confident falsehoods and a potential treatment. Like students guessing on a multiple-choice exam, LLMs are rewarded for bluffing when they don't know the answer, because grading schemes punish "I don't know" responses more harshly than plausible-but-wrong ones. That misaligned incentive sustains overconfidence and undermines corporate trust. The authors argue that evaluations create an epidemic of penalized uncertainty and abstention, and that bolting on a small number of hallucination-specific evaluations is not sufficient to fix it.

So what's the fix? The researchers suggest the path runs less through inventing new tests than through changing the rules of existing ones. These "tests" are the benchmarks and leaderboards (think multiple-choice accuracy exams) that developers use to measure whether a model is improving. Today they mostly reward confident answers and give zero credit for "I don't know."
As the paper puts it, "A simple change in mainstream assessment can recalibrate incentives and reward the appropriate expression of uncertainty rather than punish it." If models are no longer punished for saying "I don't know," there is far less incentive to bluff. That shift would help turn AI into a more reliable colleague, and give businesses more confidence to deploy the tools they have already bought.
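The incentive argument can be made concrete with a little expected-value arithmetic. The sketch below is illustrative rather than the paper's exact scoring rule: under plain accuracy grading, a guess with any chance of being right beats abstaining, while a scheme that docks points for wrong answers (with a hypothetical confidence threshold `t`) makes "I don't know" the rational choice for an uncertain model.

```python
def expected_score_accuracy(p):
    """Plain accuracy grading: +1 for a correct answer, 0 for a wrong
    answer or for 'I don't know'. p is the model's chance of being right."""
    return p * 1 + (1 - p) * 0  # abstaining scores 0, so any guess with p > 0 wins


def expected_score_penalized(p, t=0.75):
    """Penalized grading (illustrative): +1 for correct, -t/(1-t) for wrong,
    0 for abstaining. Guessing only pays in expectation when p > t."""
    penalty = t / (1 - t)
    return p * 1 - (1 - p) * penalty


# A model that is only 30% sure of its answer:
p = 0.30
print(expected_score_accuracy(p))   # positive: guessing beats "I don't know"
print(expected_score_penalized(p))  # negative: abstaining (score 0) is better

# A model that is 90% sure should still answer under either scheme:
print(expected_score_penalized(0.90))  # positive
```

The design point is that the penalty term flips the sign of the expected score exactly when confidence falls below the threshold, so the grading rule itself tells the model when bluffing stops paying.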
Until then, corporate caution looks reasonable. Companies are rarely criticized for waiting until a technology behaves like a trustworthy colleague rather than a smooth-talking guesser. Still, I subscribe to the notion that today's AI is the worst it will ever be, so I suspect C-suite caution will eventually give way to a more positive embrace.
