MIT says 95% of enterprise AI fails. Here's what the 5% are doing right


A new report from MIT has sent shockwaves through the world of enterprise AI. According to its State of AI in Business 2025 study, 95% of generative AI pilots deliver zero return on investment. The findings, based on 300 public deployments and more than 150 executive interviews, suggest that billions of dollars are being spent on AI experiments that never scale, and that most organizations are stuck on the wrong side of what MIT researchers call the “GenAI Divide.”

The numbers are stark. Forty percent of organizations say they have deployed AI tools, but only 5% have managed to integrate them into their workflows at scale. Most projects die in pilot purgatory. Meanwhile, headlines warn of an “AI bubble,” and investors are shorting AI stocks on the idea that generative AI's big enterprise moment is already stalling.

Tanmai Gopal, CEO of PromptQL. Photo: PromptQL

But not everyone agrees with that reading.

“The problem is being confident and wrong,” says Tanmai Gopal, co-founder and CEO of PromptQL, a unicorn AI company that counts OpenAI, Airbus, Siemens and NASA among its customers. “If a system is not always accurate, even in the smallest percentage of cases, I need to know when it's not. Otherwise my minutes turn into hours, and the ROI disappears.”

Verification tax

In his blog post, “Being ‘confident and wrong’ is holding AI back,” Gopal explains what he calls the “verification tax.”

“I don't know when I might get a wrong response from the AI, so every response needs to be forensically checked.”

This tax explains much of what MIT labels the GenAI Divide. Enterprises have been eager to launch pilots, but employees end up spending so much time double-checking outputs that the promised efficiency never materializes.

It's not that generative AI lacks raw horsepower. The models are dazzling. It's that their confidence is unrelenting. In a regulated or high-stakes industry, one bad answer can outweigh ten good ones. As Gopal says, “For serious work, one confident miss costs more credibility than ten successes earn.”

Learning gap

MIT researchers framed the same problem differently. They found that most enterprise AI tools do not retain feedback, adapt to workflows, or improve over time. Without those qualities, they stall.

Gopal agrees. “Without high-quality uncertainty information, we don't know whether a result is wrong because of ambiguity, missing context, outdated data, or a model error. If we don't know why it's wrong, we won't invest in making it succeed.”

That insight matters because it reframes the entire conversation. AI is not failing for lack of capability; it is failing because it was not designed to communicate its uncertainty and learn from corrections. The fix, then, is not a bigger model.

How PromptQL solves it

PromptQL has built its entire platform around solving this exact problem. Gopal calls it the difference between “confidently wrong” and “tentatively right.”

Instead of presenting its output as gospel, PromptQL calibrates reliability at the level of each response. It:

Quantifies uncertainty. Every answer comes with a confidence score. If the system is unsure, it abstains, effectively saying “I don't know.”
Surfaces the context gap. Rather than hiding uncertainty, the system flags why an answer may be unreliable: missing data, ambiguity, or lack of context.
Builds a precision flywheel. Each abstention or correction becomes training fuel. PromptQL captures these signals so the system improves continuously, closing the “learning gap” that MIT identified as the biggest cause of pilot failure.
Integrates into the workflow. Instead of sitting in a chat box, PromptQL is embedded directly in enterprise processes such as contracts, engineering, and procurement, so uncertainty flags and corrections appear exactly where the work happens.
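
To make the pattern concrete, here is a minimal Python sketch of a confidence-gated answer with abstention and correction capture. It is not PromptQL's API or implementation; every name in it (`Reason`, `Answer`, `gate`, `FeedbackLog`) and the 0.8 threshold are hypothetical, and the confidence score is assumed to be supplied by some upstream model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Reason(Enum):
    """Hypothetical labels for why an answer may be unreliable."""
    MISSING_DATA = "missing data"
    AMBIGUITY = "ambiguity"
    MISSING_CONTEXT = "lack of context"


@dataclass
class Answer:
    text: Optional[str]   # None means the system abstained ("I don't know")
    confidence: float     # 0.0 to 1.0, assumed to come from an upstream model
    reasons: list = field(default_factory=list)

    @property
    def abstained(self) -> bool:
        return self.text is None


def gate(draft: str, confidence: float, reasons: list,
         threshold: float = 0.8) -> Answer:
    """Return the draft only when confidence clears the (assumed) threshold."""
    if confidence < threshold:
        # Abstain, but keep the score and the reasons so the user sees why.
        return Answer(None, confidence, list(reasons))
    return Answer(draft, confidence, list(reasons))


@dataclass
class FeedbackLog:
    """Collects abstentions and human corrections for later improvement."""
    events: list = field(default_factory=list)

    def record(self, question: str, answer: Answer,
               correction: Optional[str] = None) -> None:
        self.events.append(
            {"question": question, "answer": answer, "correction": correction}
        )

    def training_examples(self) -> list:
        # Only events with a human correction become (question, fix) pairs.
        return [(e["question"], e["correction"])
                for e in self.events if e["correction"] is not None]


log = FeedbackLog()
answer = gate("The contract renews on 1 March.", 0.55, [Reason.MISSING_DATA])
log.record("When does the contract renew?", answer,
           correction="The contract renews on 1 April.")
```

In this sketch the low-confidence draft is withheld, the reason travels with the abstention, and the human correction is stored as a candidate training example. How such signals would actually be fed back into a production system is left open here.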

“The starting point for this loop is whether an AI system can tell users, in a concrete and native way, when it is not sure of its accuracy,” writes Gopal. That loop of abstaining, correcting, and learning is what he calls the precision flywheel. “You don't need perfection. You need a tight loop.”

Why tentatively right beats confidently wrong

This humility-first approach has been adopted in some of the most skeptical corners of the enterprise market. While 95% of pilots stall, PromptQL has closed seven- and eight-figure contracts with Fortune 500 companies, government agencies, and regulated industries.

The company is proof that enterprise AI is not failing. The wrong kind of enterprise AI is.

As Gopal states, “No amount of solving the other problems, such as integration, data readiness, and organizational readiness, will change the fact that being confidently wrong keeps AI away from real use cases.”

A different conclusion

So the takeaway is not that AI is destined to fail. It is that companies must demand a different kind of AI: one that is transparent about its uncertainty, tightly integrated into the workflow, and able to improve with every interaction.

The MIT report is right to highlight the GenAI Divide. But if you focus only on the 95% of attempts that fail, you'll miss the 5% that are actually scaling.

Companies that build and adopt AI that admits when it doesn't know are quietly rewriting the story. PromptQL is one of them.

And if their traction is any indication, the conclusion is not that enterprise AI is a bubble. It is that a few companies are already finding ways to burst it.
