According to researchers at MIT, 95% of AI agent pilots never reach production. Most companies carry on without addressing the real problem: incorrect information and weak context. Here’s why the quality of context determines whether an AI agent actually works.
The 95% problem


AI agents didn’t become stupid overnight. They fail on outdated policies, missing updates, or a working memory crammed so full that the one useful fact gets buried.
The 95% figure is the share of AI pilot projects that never end up in daily use by the business. In other words, the team builds the agent, tests it, maybe demos it to upper management, and then quietly shuts it down before it ships. The number comes from the MIT Media Lab’s NANDA initiative, in a report titled The GenAI Divide: State of AI in Business 2025. It drew on 52 executive interviews, 153 leadership surveys, and 300 public deployments. However, context alone doesn’t tell the whole story.
The researchers also pointed to weak governance and poor workflow integration. Internal builds that lacked buy-in were another factor. Still, outdated or disorganized information came up over and over again.
Context is what the model actually operates on
A model needs something to reason about in the first place. That means live data, past interactions, and user history. It also means knowing what is currently happening within the system. If you pass in a stale version of any of that, the output breaks.
Imagine a support agent using last quarter’s refund policy, or a workflow bot operating on old data. The answer sounds confident. It’s wrong because it’s built on outdated facts.
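As a rough illustration of the refund-policy example, here is a minimal sketch of a freshness check that drops stale documents before they reach the model. The `Document` class, the `drop_stale` function, the 90-day threshold, and the sample policies are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    # Hypothetical record type; real systems will carry more metadata.
    title: str
    text: str
    last_updated: date


def drop_stale(docs, today, max_age_days=90):
    """Keep only documents updated within the last max_age_days days."""
    cutoff = today - timedelta(days=max_age_days)
    return [d for d in docs if d.last_updated >= cutoff]


# Illustrative data, not taken from any real policy.
docs = [
    Document("Refund policy (last quarter)", "Refunds within 14 days.", date(2025, 1, 10)),
    Document("Refund policy (current)", "Refunds within 30 days.", date(2025, 6, 2)),
]

fresh = drop_stale(docs, today=date(2025, 6, 30))
print([d.title for d in fresh])  # only the current policy survives
```

An age cutoff is the crudest possible signal. A team might instead track explicit version numbers or "superseded by" links, but the principle is the same: decide what counts as current before the agent answers.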
When more context backfires


Throwing more information at the agent often backfires. The model gets overloaded and loses focus on what’s important.
Researchers call this the “lost in the middle” effect. Bury a fact in the center of a long document and models tend to miss it. They pay far more attention to what’s at the beginning or end. Stack up enough of these misses and the problem compounds. People now call this pattern context rot.
Klarna’s AI deployment became a textbook example in 2025: lots of information, but not the right kind.
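One mitigation that is sometimes tried, and which is an assumption here rather than something the report prescribes, is to reorder retrieved passages so the most relevant ones sit at the start and end of the context and the weakest ones fall in the middle. The function name `reorder_for_edges` and the sample labels are hypothetical.

```python
def reorder_for_edges(ranked):
    """Place top-ranked items at the start and end, weakest in the middle.

    `ranked` must already be ordered from most to least relevant.
    """
    front, back = [], []
    for i, item in enumerate(ranked):
        # Alternate: even positions go to the front, odd ones to the back.
        if i % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]


# "A" is the most relevant passage, "E" the least (illustrative labels).
ranked = ["A", "B", "C", "D", "E"]
print(reorder_for_edges(ranked))  # ['A', 'C', 'E', 'D', 'B']
```

Reordering only helps once the passages are worth including at all. It does not replace cutting down what goes into the context.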
Techniques for deciding what to show and what not to show
One technique attracting attention here is context pruning. This means deciding what the agent doesn’t need to see for a particular task. It is not about randomly deleting information. It’s about filtering intentionally so agents can stay focused.
Teams doing this see faster responses, fewer inconsistencies, and lower costs. Anthropic’s engineers put it simply: find the smallest set of information that gets the job done.
Prompt engineering hits a wall
Prompt engineering is still important, but it is no longer the main lever. A clever prompt won’t help if the agent is retrieving from old or messy data. Wording used to be the most difficult part of the job. Now it’s just one part of a larger system.
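To make the idea of a minimum useful set concrete, here is a small sketch of pruning under a budget. It scores snippets by keyword overlap with the task and keeps the best ones that fit. The keyword scoring, the word-count budget, and the name `prune_context` are simplifying assumptions; production systems more commonly use embedding similarity and token counts.

```python
def prune_context(snippets, task_keywords, max_words=40):
    """Keep the most task-relevant snippets that fit inside a word budget."""
    keywords = {k.lower() for k in task_keywords}

    def score(snippet):
        # Count distinct task keywords that appear in the snippet.
        words = {w.strip(".,").lower() for w in snippet.split()}
        return len(words & keywords)

    kept, used = [], 0
    for snippet in sorted(snippets, key=score, reverse=True):
        if score(snippet) == 0:
            break  # everything from here on is irrelevant to the task
        size = len(snippet.split())
        if used + size <= max_words:
            kept.append(snippet)
            used += size
    return kept


# Illustrative snippets, not real company data.
snippets = [
    "Refunds are issued within 30 days of purchase.",
    "The office kitchen is cleaned on Fridays.",
    "Refund requests need an order number.",
    "Our company was founded in 2009.",
]
kept = prune_context(snippets, ["refund", "refunds", "order"])
print(kept)
```

The two refund snippets are kept and the unrelated ones never reach the model, which is the point: the filter is deliberate, not random deletion.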
Build an information pipeline that actually works


Context engineering ultimately comes down to a few decisions: what the model knows, when it knows it, and how that information is structured. In practice, that means:
- A retrieval system that pulls only the relevant slices of data
- A memory system that preserves useful history without dragging everything along
- Fresh data pipelines that keep information up to date
- Relevance filtering that removes noise before it reaches the model
- State tracking so agents know what’s going on
A lot of teams are still in the early stages here. It’s worth prioritizing before expanding further.
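Two of the pieces listed above, memory and state tracking, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the class `AgentContext`, its methods, and the prompt layout are hypothetical, and the retrieved facts are assumed to come from a separate retrieval and filtering step.

```python
from collections import deque


class AgentContext:
    """Holds a bounded conversation history plus the current task state."""

    def __init__(self, max_turns=3):
        # deque with maxlen drops the oldest turn automatically when full.
        self.history = deque(maxlen=max_turns)
        self.state = {}

    def record_turn(self, role, text):
        self.history.append((role, text))

    def update_state(self, **fields):
        self.state.update(fields)

    def build_prompt(self, retrieved):
        """Assemble state, retrieved facts, and recent turns into one prompt."""
        lines = ["Current state:"]
        lines += [f"- {k}: {v}" for k, v in sorted(self.state.items())]
        lines.append("Relevant facts:")
        lines += [f"- {fact}" for fact in retrieved]
        lines.append("Recent conversation:")
        lines += [f"{role}: {text}" for role, text in self.history]
        return "\n".join(lines)


ctx = AgentContext(max_turns=2)
ctx.record_turn("user", "Hi there")
ctx.record_turn("agent", "Hello, how can I help?")
ctx.record_turn("user", "I want a refund")
ctx.update_state(order_id="A123", status="refund requested")

prompt = ctx.build_prompt(["Refunds are issued within 30 days of purchase."])
print(prompt)
```

Because the history is capped at two turns, the opening greeting has already fallen out of the prompt, while the order state stays visible on every call.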
What this means for your business
Agents need living knowledge: not just a folder of static PDFs, but a representation of how your business actually operates. Connect them to reliable real-time systems. Remove any sources that are inconsistent or outdated.
Three things to prioritize:
- Keep your data up to date. Outdated information produces confidently incorrect answers.
- Remove noise before retrieval. Filtering upstream beats hoping the model will sort it out later.
- Treat information setup as seriously as security.
Conclusion
Reasoning skills still matter. But they no longer separate the winners from the rest. What really separates agents that ship from those that stall? What they are reading behind the scenes.
FAQ
Why do AI agents fail even though the AI itself is powerful?
Usually it’s the quality of the information they are given. No matter how capable the model is, incorrect information in means incorrect answers out.
Why do AI agents sometimes forget things in the middle of long documents?
This is the “lost in the middle” effect. Models tend to overlook information in the middle of long text, favoring what appears at the beginning or end.
Why would we intentionally try to hide information from an AI agent?
Too much information can distract the model. Leaving out what is of little use, a technique known as context pruning, keeps the agent focused.
Is it still important to write really good prompts?
Yes, but it is only one piece. A well-written prompt cannot make up for missing or outdated information.
Why is context engineering so important now?
This year, more companies are building AI agents into real business systems, not just demos. Providing them with clean, up-to-date information is no longer optional.
If companies want to solve this problem, where should they start?
Start with the data. Make it live, reliable, and clean, then build your retrieval and filtering around it.

