This interview analysis is sponsored by Hitachi Vantara and was written, edited, and published in alignment with our Emerj sponsored content guidelines. Learn more about our thought leadership and content creation services on the Emerj Media Services page.
Poor data quality significantly degrades the performance of machine learning models by introducing errors, biases, and inconsistencies that propagate throughout the pipeline, undermining accuracy and reliability. Research published by the University of Amsterdam in the Netherlands demonstrates that key quality aspects such as accuracy, completeness, and consistency directly impact predictive power.
The paper points out that models trained on flawed data can produce erroneous results that harm business operations, resulting in financial losses and damage to an organization's reputation. In high-stakes sectors such as finance and healthcare, even small degradations caused by poor data quality can lead to costly or harmful business decisions, limiting the trustworthiness of large-scale AI systems.
Poor data quality and infrastructure limitations are among the most expensive hidden costs businesses face. According to a report from Hitachi Vantara, by 2025, large organizations will need to process nearly double the amount of data they currently have, averaging over 65 petabytes. However, 75% of IT leaders are concerned that existing infrastructure cannot scale to meet these needs due to limited data access, speed, and reliability, directly impacting the effectiveness of AI. These challenges result in wasted time, inefficient decision-making, and increased operational costs.
On a recent episode of the AI in Business podcast, Emerj Editorial Director Matthew DeMello spoke with Sunitha Rao, Senior Vice President of Hybrid Cloud Business at Hitachi Vantara, to discuss the infrastructure and data challenges of scaling AI and how to build reliable and sustainable workflows to overcome them.
This article examines two key insights from their conversation for organizations looking to scale AI effectively:
- Optimize your data for performance and reliability: Prioritizing data quality, freshness, and governance while implementing anomaly, PII, and redundancy checks enhances workflows and prevents costly errors.
- Prioritize intelligent, supervised, and sustainable AI workflows: Defining meaningful SLOs and strategically placing workloads optimizes performance, cost, and sustainability.
Guest: Sunitha Rao, Senior Vice President, Hybrid Cloud Business, Hitachi Vantara
Expertise: Business strategy, cloud computing, storage virtualization
Brief recognition: Sunitha leads innovation and strategic growth at Hitachi Vantara, delivering cloud solutions. She previously worked at NetApp and Nimble Solutions. She earned her Master's degree in Business Administration from the Indian Institute of Management.
Optimize data for performance and reliability
Sunitha Rao kicked off the conversation by listing key challenges in scaling AI and highlighting critical infrastructure demands. She explains that unstructured data is often scattered across silos, creating complex governance and compliance hurdles. The instinct to simply add more GPUs or data centers doesn't work, she points out, because hardware shortages and power, cooling, and sustainability constraints quickly create bottlenecks.
Distributed workloads demand low-latency, high-bandwidth networks, yet legacy storage systems struggle with AI read/write patterns and call for an integrated, scalable solution. Hybrid and multicloud environments also require an optimized MLOps pipeline.
Finally, Rao emphasizes that rising costs make ESG alignment and clear ROI essential. Strong leadership and an AI-enabled platform with flexible compute, storage, auto-tiering, and integrated MLOps are needed to close these gaps.
Rao went on to emphasize that poor-quality data is extremely costly, especially for AI at scale, summarizing the problem as follows: “Garbage goes in, expensive garbage goes out.”
She points out that these problems can be addressed by building robust checks into workflows early on, such as the following (a brief code sketch appears after the list):
- Evaluate lower and upper error bounds: Understand the full range of error present in your data.
- Handle noisy or duplicate data: Identify and manage redundant and irrelevant inputs.
- Monitor gradient variance: Verify that the dataset does not introduce instability into model training.
- Ensure a high-quality data foundation: Clean, diverse, deduplicated data improves performance, especially on out-of-distribution inputs.
- Address safety and bias: Poor-quality or biased data can amplify security risks, magnify breaches, and increase costs across training/testing cycles.
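To make a few of these checks concrete, here is a minimal sketch of pre-training data quality checks in Python with pandas. The function name, error bounds, and sample data are illustrative assumptions for this article, not Hitachi Vantara tooling:

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame, bounds: dict) -> dict:
    """Illustrative pre-training checks: duplicates, missing values,
    and values outside expected lower/upper error bounds."""
    report = {}

    # Noisy / duplicate data: flag exact duplicate rows for review or removal.
    report["duplicate_rows"] = int(df.duplicated().sum())

    # Completeness: count missing values per column.
    report["missing_values"] = df.isna().sum().to_dict()

    # Lower/upper error bounds: count out-of-range values per numeric column.
    out_of_bounds = {}
    for column, (low, high) in bounds.items():
        out_of_bounds[column] = int(((df[column] < low) | (df[column] > high)).sum())
    report["out_of_bounds"] = out_of_bounds

    return report

# Example: a hypothetical sensor dataset with an expected temperature range.
df = pd.DataFrame({"temperature_c": [21.5, 22.0, 22.0, 480.0],
                   "humidity": [40, 41, 41, None]})
print(run_quality_checks(df, bounds={"temperature_c": (-40.0, 60.0)}))
```

Reports like this can feed pass/fail gates before a dataset is ever handed to a training job.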
Rao goes on to unpack the importance of improving AI data workflows by focusing on quality over quantity.
“Instead of building a bigger haystack, we should be looking at how to build better needles within the system. That's where we improve the degradation points in data flows. I think it's essential to consider data freshness and quality gates, for example, in streaming ETL.
Knowing what kind of information is being used requires things like schema checks, anomaly detection, and PII detection, which is why we are looking at implementing PII data services. Basically, it's about thinking through how to close these quality gaps, add more checkpoints before training and serving, and create a seamless workflow without skewing the data.”
– Sunitha Rao, Senior Vice President, Hybrid Cloud Business, Hitachi Vantara
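As one illustration of the quality gates Rao describes, the sketch below screens records in a streaming ETL step for schema, anomaly, and PII issues before they reach training or serving. The expected schema, the plausibility threshold, and the PII pattern are assumptions made for this example and do not reflect Hitachi Vantara's implementation:

```python
import re

# Hypothetical schema and PII pattern for the sake of the sketch.
EXPECTED_SCHEMA = {"user_id": int, "amount": float, "notes": str}
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def passes_quality_gate(record: dict) -> tuple[bool, list[str]]:
    issues = []

    # Schema check: every expected field must be present with the right type.
    for field, expected_type in EXPECTED_SCHEMA.items():
        if not isinstance(record.get(field), expected_type):
            issues.append(f"schema: {field} missing or not {expected_type.__name__}")

    # Simple anomaly check: flag implausible values before they reach training.
    if isinstance(record.get("amount"), float) and not (0.0 <= record["amount"] <= 1_000_000.0):
        issues.append("anomaly: amount outside plausible range")

    # PII check: block free-text fields that contain email addresses.
    if isinstance(record.get("notes"), str) and EMAIL_PATTERN.search(record["notes"]):
        issues.append("pii: email address found in notes")

    return (len(issues) == 0, issues)

ok, issues = passes_quality_gate({"user_id": 42, "amount": 19.99, "notes": "contact me at a@b.com"})
print(ok, issues)  # False, ['pii: email address found in notes']
```

In a production pipeline, records that fail the gate would be quarantined or scrubbed rather than silently dropped, so the skew Rao warns about does not creep in.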
Prioritize intelligent, supervised, and sustainable AI workflows
Rao then discusses the importance of monitoring, repeatability, and service-level objectives (SLOs) in AI workflows. She emphasizes that early detection requires continuously tracking datasets, creating root-cause alerts and playbooks, and moving beyond traditional threshold-based scripts to self-learning models that adapt at every step.
Tracking datasets, features, models, and code versions is important for rebuilding models, learning from past failures, and systematically addressing problems. Finally, she emphasizes that SLOs should go beyond simple metrics like latency: to ensure a reliable, resilient, and continuously improving AI infrastructure, teams must define and monitor meaningful SLOs and proactively address violations.
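One lightweight way to capture that kind of lineage is to record a dataset fingerprint alongside the feature set, model, and code versions for every training run. The sketch below is a minimal, hypothetical example; the field names and hashing choice are assumptions, not a prescribed format:

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass
class RunRecord:
    """Minimal lineage record: enough to rebuild a model or trace a failure."""
    dataset_hash: str        # fingerprint of the training data snapshot
    feature_set_version: str
    model_version: str
    code_commit: str         # e.g. a git SHA

def fingerprint_dataset(rows: list[dict]) -> str:
    """Hash a dataset snapshot so retraining can verify it used the same data."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

rows = [{"user_id": 1, "label": 0}, {"user_id": 2, "label": 1}]
record = RunRecord(
    dataset_hash=fingerprint_dataset(rows),
    feature_set_version="features-v7",
    model_version="churn-model-3.2",
    code_commit="abc1234",
)
print(asdict(record))  # persist alongside the trained model artifact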
Rao explains that SLOs are the foundation of AI infrastructure, serving as commitments defined for each workflow to prevent degradation. SLOs give customers a framework for understanding what can reliably be delivered across training, serving, and data pipelines.
Once these objectives are set, the focus shifts to improving results and ensuring seamless execution across offline/online processes, batch workflows, search systems, and vector store pipelines. She emphasizes the need for regular KPI reviews that track metrics such as data freshness, training/serving skew, and pass/fail rates. Monitoring these metrics helps pinpoint where degradation begins, allowing teams to put the right controls in place and keep AI systems reliable and high-performing.
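For illustration, the sketch below evaluates a pipeline against a few of the SLO dimensions Rao mentions: data freshness, training/serving skew, and quality-check pass rates. The thresholds and field names are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class PipelineSLO:
    max_data_age_minutes: float   # data freshness
    max_train_serve_skew: float   # allowed gap between offline and online metrics
    min_check_pass_rate: float    # quality-gate pass/fail rate

def evaluate_slo(slo: PipelineSLO, data_age_minutes: float,
                 train_serve_skew: float, check_pass_rate: float) -> list[str]:
    """Return a list of SLO violations so a team can alert and run its playbook."""
    violations = []
    if data_age_minutes > slo.max_data_age_minutes:
        violations.append(f"freshness: data is {data_age_minutes:.0f} min old")
    if train_serve_skew > slo.max_train_serve_skew:
        violations.append(f"skew: training/serving gap {train_serve_skew:.3f}")
    if check_pass_rate < slo.min_check_pass_rate:
        violations.append(f"quality: pass rate {check_pass_rate:.1%} below target")
    return violations

slo = PipelineSLO(max_data_age_minutes=60, max_train_serve_skew=0.05, min_check_pass_rate=0.99)
print(evaluate_slo(slo, data_age_minutes=95, train_serve_skew=0.02, check_pass_rate=0.97))
```

Violations surfaced this way can drive the root-cause alerts and playbooks described above rather than sitting unread in a dashboard.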
Finally, Rao discusses the importance of mapping workloads to the right place to run, whether on-premises, in the public cloud, at the edge, or in a hybrid environment, because these decisions determine investment, ROI, performance, compliance, and sustainability. Deciding where data lives allows organizations to design power-efficient infrastructure, tiered storage, and carbon-conscious operations.
“When we started talking about carbon awareness, people began talking about using carbon like cash. This is an important part of building ROI and sustainability, and you need a policy engine that defines the consequences of where data should go. This approach helps leaders align ROI with a true sustainability framework by weighing infrastructure costs, carbon savings, and performance delays.”
– Sunitha Rao, Senior Vice President, Hybrid Cloud Business, Hitachi Vantara
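One way to picture the policy engine Rao describes is a scoring function that weighs cost, carbon, and latency when choosing where a workload should run. The candidate locations, numbers, and weights below are entirely hypothetical:

```python
# Hypothetical per-location metrics; real figures would come from telemetry and billing.
CANDIDATES = {
    "on_prem":      {"cost_per_hour": 3.2, "kg_co2_per_hour": 0.9, "latency_ms": 4},
    "public_cloud": {"cost_per_hour": 4.1, "kg_co2_per_hour": 0.5, "latency_ms": 18},
    "edge":         {"cost_per_hour": 5.0, "kg_co2_per_hour": 0.7, "latency_ms": 2},
}

def place_workload(candidates: dict, weights: dict) -> str:
    """Pick the location with the lowest weighted cost/carbon/latency score."""
    def score(metrics: dict) -> float:
        return sum(weights[key] * metrics[key] for key in weights)
    return min(candidates, key=lambda name: score(candidates[name]))

# A latency-sensitive inference workload favors the edge...
print(place_workload(CANDIDATES, {"cost_per_hour": 1.0, "kg_co2_per_hour": 1.0, "latency_ms": 2.0}))   # edge
# ...while a carbon-weighted batch training job lands in the lower-carbon public cloud.
print(place_workload(CANDIDATES, {"cost_per_hour": 1.0, "kg_co2_per_hour": 5.0, "latency_ms": 0.05}))  # public_cloud
```

Adjusting the weights is how leaders can trade off ROI, carbon savings, and performance delays in the way the quote above suggests.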

