The race to leverage AI is changing. Success is no longer determined by who adopts and deploys AI first, but by whether the data infrastructure is ready to support the technology at scale.
A new survey of more than 1,200 IT leaders across industries and geographies, conducted by hybrid data and AI platform provider Cloudera, finds that while respondents are confident in their data strategy, gaps in data readiness are hindering ROI.
In the survey, 79% of respondents said their efforts are hampered by limited access to data across their environment. Another global study conducted by Harvard Business Review Analytic Services in collaboration with Cloudera found that only 7% of companies say their data is fully AI-ready. Meanwhile, Gartner predicts that organizations will abandon 60% of AI projects this year due to a lack of AI-ready data.
“Taken together, these findings reveal a significant disconnect between where organizations want to be with AI and where they actually are,” said Sergio Gago, chief technology officer at Cloudera.
“By investing in data preparation as a strategic initiative, organizations can close that gap and scale efficiently to derive lasting business value from AI.”
Cloudera’s research highlights some key challenges that still exist even as organizations advance their current data and AI strategies. Here’s what they are and what your company can do to address them.
Strengthen governance and fix data quality
The power of AI is determined by the data used to train it. Gaining a competitive edge means, first, moving beyond general-purpose LLMs trained on public data to models built on proprietary data. Second, and more importantly, it means making sure that data is of high quality.
Cloudera’s research suggests that IT leaders have a false sense of confidence in the quality of their data. While 84% of respondents trust the accuracy, completeness, and integrity of their organization’s data, 30% cite data quality as the main reason AI projects fail to achieve ROI.
Governance is part of the problem, too. Fewer than one in five respondents said their data was fully under control. Another 71% said their data is mostly under control, but with gaps that can degrade AI output. Modern AI tools pull information from everything, including documents, emails, and meeting notes, not just clean databases. That means you need to manage all your data: structured, unstructured, and everything in between.
A platform-agnostic approach to governance is a good place to start. Ideally, you define governance policies once and apply them wherever your data resides, without the need for separate systems or manual monitoring.
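As a rough illustration of the "define once, apply everywhere" idea, here is a minimal Python sketch. It is not taken from Cloudera’s research or products; the `MaskingPolicy` class, the `governed_view` helper, the field names, and the two sample sources are all hypothetical, chosen only to show a single rule being reused across different kinds of data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    """Hypothetical governance rule: a named set of fields that must be masked."""
    name: str
    masked_fields: frozenset

    def apply(self, record: dict) -> dict:
        # Return a copy with sensitive fields redacted; other fields pass through.
        return {k: ("***" if k in self.masked_fields else v) for k, v in record.items()}


# The policy is defined once...
PII_POLICY = MaskingPolicy("pii", frozenset({"email", "ssn"}))

# ...and applied to records from any source, structured or not (sample data is made up).
SOURCES = {
    "warehouse": [{"id": 1, "email": "a@example.com", "region": "EU"}],
    "meeting_notes": [{"id": 7, "ssn": "000-00-0000", "text": "Q3 planning"}],
}


def governed_view(sources: dict, policy: MaskingPolicy) -> dict:
    """Apply the same policy to every record in every source."""
    return {name: [policy.apply(r) for r in records] for name, records in sources.items()}


view = governed_view(SOURCES, PII_POLICY)
print(view)
```

In this sketch, the rule lives in one place and the sources know nothing about it, so adding a new platform means adding an entry to `SOURCES` rather than writing a new policy.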
Make data easily accessible to teams and tools
Well-managed, high-quality data is useless if it cannot be accessed by humans or AI tools. However, Cloudera’s research found that access and integration remain major bottlenecks.
Even among technology organizations that have made significant investments in cloud and modern data platforms, 56% of respondents said they lack complete access to their data. In fact, the rapid adoption of new technology may be part of the problem.
Every time a new platform is added without a unified strategy, new silos are formed and data becomes fragmented across the system. According to 34% of IT leaders, this is the biggest problem preventing their teams from using data effectively. Integration also remains a concern, with only 30% of IT leaders saying their data sources are fully integrated across systems.
This leaves teams and tools working with different and incomplete versions of the same data, making it difficult to achieve accurate results. The organizations that are moving forward are those that have reversed that model. In other words, they are building a more mature architecture in which data is the foundation and everything else rests on top of it, rather than a byproduct of the tools that generated it.
Understand your organization’s unique barriers to data preparation
While these are powerful best practices, there is no universal playbook for data preparation. Cloudera’s research highlights that different industries face different barriers.
For example, technology companies, the public sector, and telecommunications organizations agree that data quality is their biggest challenge, while energy and utilities companies are more concerned about cost overruns. Meanwhile, healthcare, manufacturing, and financial services struggle the most with weak integration into workflows.
Leadership-level challenges may also exist. Other common barriers to effective data use include complex access requirements and processes, inadequate training and data literacy, and cultural resistance to data sharing. Preparing for the future starts with an honest look at where your data currently falls short.
“Our research reveals that although most leaders recognize the importance of data readiness, significant structural, cultural and governance challenges continue to hold them back,” Gago said.
“The companies that actually move forward will be the ones that aren’t afraid to admit their weaknesses and treat improvement as a strategic priority rather than an IT to-do list.”
Learn more about how Cloudera helps organizations ensure their data is ready for an AI-driven future.
This post was created by Insider Studio with Cloudera.
