AWS rolled out a series of updates to its cloud operations services this week, all aimed at making it easier to manage multi-cloud and multi-environment infrastructure as enterprises prepare for AI.
New features announced at the annual AWS re:Invent 2025 conference included a multicloud interconnect service launched in collaboration with Google and based on a new open API specification; a set of highly abstracted features for Elastic Kubernetes Service (EKS); simplified multi-account, multi-region observability data management; and new AIOps automation capabilities.
The catalyst for all of this change is the emergence of enterprise AI apps and agents, which correlate with large-scale multicloud deployments and involve an explosion of operational data, according to Monday's keynote by Nandini Ramani, vice president of cloud operations at AWS.
“Agents certainly help increase efficiency and simplify things a lot, but they also create certain operational challenges,” Ramani said. “Operational complexity is only going to increase. You're managing microservices, you're introducing distributed agents and event-driven architectures, and you're managing all this across dozens, if not hundreds, of accounts and multiple regions.”
Throughout presentations during re:Invent, AWS officials emphasized their desire to support customers wherever they run their AI workloads, even if that's not in the AWS cloud. According to IDC analyst Matt Flug, this marks a break from tradition for AWS, which has historically focused on getting enterprise customers to go “all in” on its cloud.
“This is relatively new, especially for AWS, but really for all three hyperscalers,” Flug said.
AI apps are just one part of the broader motivation for this strategy change, Flug added. High-profile outages at Google, AWS and Azure this year, as well as new EU regulations requiring digital resiliency, have also increased enterprise interest in multi-cloud infrastructure, he said.
“Clearly, [the interconnect API] between Google and AWS has been in the works for a long time, even before the outages, but I think more companies are using multi-cloud, and even more are planning a multi-cloud strategy,” he said. “From a resiliency perspective, we're hearing a lot about it from clients.”
AWS Vice President Nandini Ramani delivers a keynote focused on cloud operations at re:Invent 2025.
AWS Interconnect simplifies multicloud networking
One of the most notable changes from AWS's traditional closed approach to cloud infrastructure, spurred by the new AI gold rush, was the rollout of multicloud AWS Interconnect, which is in preview. The service was designed in collaboration with Google using new open source API specifications and is aimed at simplifying the configuration of multi-cloud networks, according to a joint blog post.
“Previously, connecting to a cloud service provider required customers to manually configure complex networking components, including physical connections and equipment. This approach required long lead times and coordination with multiple internal and external teams,” the post said. “You can now provision dedicated bandwidth on demand and establish a connection in minutes through your favorite cloud console or API.”
According to one IT leader, companies that have already made significant investments in multicloud infrastructure management for AI will welcome assistance with manual configuration.
“When it comes to AI and providing generative AI models to power our AI services, we use the right model for the task, which can be difficult because no single provider has a model that works for every task in the 60+ languages we support and every geographic region we operate in,” said Ian Beaver, chief data scientist at Verint, a contact center-as-a-service provider in Melville, N.Y., in an email to Informa TechTarget.
“Therefore, some product services run on one hyperscaler while the AI models they use are served from another, via [Amazon] Bedrock or Azure OpenAI,” Beaver said. “This requires setting up secure cloud-to-cloud networking, which can be time-consuming to deploy. Automation around setting up secure networking across hyperscalers is welcome and reduces both complexity and deployment time when configuring new regions.”
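The per-task, per-language routing Beaver describes can be sketched as a simple lookup. This is a hypothetical illustration of the pattern only; the routing table, provider names and model names are assumptions for the example, not Verint's actual configuration.

```python
# Hypothetical sketch: route each (task, language) pair to the hyperscaler
# and model best suited for it, with a fallback when no entry matches.
# All entries below are illustrative, not a real deployment.
ROUTING_TABLE = {
    ("summarize", "en"): ("bedrock", "example-model-a"),
    ("summarize", "de"): ("azure-openai", "example-model-b"),
    ("classify", "en"): ("bedrock", "example-model-c"),
}

DEFAULT = ("bedrock", "example-fallback-model")

def pick_model(task: str, language: str) -> tuple[str, str]:
    """Return a (provider, model) pair for a task/language combination,
    falling back to a default when no provider covers it."""
    return ROUTING_TABLE.get((task, language), DEFAULT)
```

Because models for some language/task combinations live on a different hyperscaler than the product service itself, every cross-provider entry in a table like this implies the secure cloud-to-cloud networking Beaver mentions.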
Another joint customer of Google Cloud and AWS highlighted the resiliency benefits of the new interconnect service.
“I think the EU resiliency regulations explain this better than I ever could,” said David Strauss, chief architect and co-founder of WebOps company Pantheon. “This is where these network tunnel integrations shine: when you unify layers of your application stack across clouds. We use about 95% GCP, but some resources are on AWS, and I've been managing the routing and tunnels myself, so I welcome this effort. It's a huge project to achieve secure, low-latency, reliable links between data centers. There's very little that's unique to each implementation, but somehow it never feels that way.”
AWS is taking the Interconnect specification one step further this week with another service in gated preview called AWS Interconnect – last mile.
“Interconnect – Last Mile offers the same rapid deployment as AWS Direct Connect out to customer facility locations such as private data centers and campuses, essentially an automated WAN,” said Jim Frey, an analyst at Informa TechTarget's Omdia. “Lumen is the first ISP they've partnered with, but it will require a lot of partnering with the ISP provider community.”
Rob Strechay, an analyst at TheCube Research, said cost must also be considered for interconnects to become a reality for mainstream enterprises.
“The proof will be in the monthly bills,” he said. “It will be interesting to see the eventual pricing tiers, but if your company is already doing this cross-cloud, it could be nominal.”
EKS capabilities move management “above the stack”
The three new options for EKS users that became generally available this week also follow the overall theme of simplifying cloud operations, “allowing you to focus on deploying applications rather than maintaining platform infrastructure,” according to the AWS website.
The EKS features include a managed service for GitOps workflows based on Argo CD; AWS Controllers for Kubernetes, which provide cloud infrastructure management within the EKS control plane; and Kube Resource Orchestrator, which platform teams can use to create custom resource templates. All three are based on open source projects but are focused on AWS infrastructure, as opposed to other open source counterparts designed for similar purposes that support multiple clouds, such as Crossplane and KubeVela. Commercial products such as Red Hat OpenShift also support large-scale multi-cluster management in hybrid clouds.
For users who have already invested in EKS, the new features align with broader trends in the Kubernetes ecosystem focused on easier multi-cluster management to support AI scale, said Omdia analyst Torsten Volk.
“They all have the same ambitions,” Volk said. “It's basically a race to onboard as many AI workloads as possible and make sure to show people that scale doesn't matter, because we've heard that all of these workloads have incredible resource requirements, and none of the three big cloud vendors want to have any suspicion that maybe they can't handle it.”
As with the interconnects, price will be a factor in the adoption of the EKS features, Strechay said. For example, according to the EKS pricing page, the Argo CD feature has a base rate of $0.02771 per hour, plus a per-application rate of $0.00136 per hour; with 100 applications, Argo CD costs roughly $120 per month. A thread on Reddit this week drew comments opposing the pricing, with one calling it an “atrocious markup.”
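The arithmetic behind that estimate is straightforward given the two published rates. The sketch below assumes AWS's conventional 730 hours per month; the rates are the ones quoted above.

```python
# Back-of-the-envelope monthly cost for the managed Argo CD feature,
# using the hourly rates quoted above. 730 hours/month is an assumption
# (AWS's usual averaging convention), not a published billing rule.
BASE_RATE = 0.02771       # USD per hour, flat
PER_APP_RATE = 0.00136    # USD per hour, per application
HOURS_PER_MONTH = 730

def monthly_cost(num_apps: int) -> float:
    """Estimated monthly cost in USD for a given application count."""
    hourly = BASE_RATE + PER_APP_RATE * num_apps
    return hourly * HOURS_PER_MONTH

print(round(monthly_cost(100), 2))  # prints 119.51 -- roughly $120/month
```

At 1,000 applications the same formula works out to roughly $1,000 per month, which is where Strechay's warning about development environments with many applications starts to bite.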
“If you use it significantly beyond that [number of applications] in a development environment, fees can easily run into the tens of thousands of dollars per month,” Strechay said. “At enterprise scale, it is much more cost-effective to run the service yourself.”
Volk said that in the long term, it's clear that AWS aims to make AI agents the primary management layer for EKS, as evidenced by this week's release of the EKS Model Context Protocol (MCP) server and DevOps frontier agent.
“They see these agents as super-orchestrators, and they want developers to be able to manage EKS through agents so they don't have to access CloudFormation or CloudWatch,” Volk said. “Instead, you can capture it one level above, at the agent level, so you don't have to know exactly which metrics you need to look at. Based on the context, it will tell you which Prometheus metrics you need to collect, and so on. That's what they're trying to do.”
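The abstraction Volk describes amounts to mapping a high-level diagnostic context to the low-level metrics worth collecting. Here is a toy sketch of that idea; the context names and the mapping itself are assumptions for illustration, not the actual behavior of the EKS MCP server (the metric names are standard kube-state-metrics and node-exporter series).

```python
# Toy illustration of agent-level abstraction: translate a high-level
# diagnostic context into the Prometheus metrics to collect, so the
# developer never has to pick metrics by hand. The mapping is invented
# for this example; only the metric names are real Prometheus series.
CONTEXT_TO_METRICS = {
    "pod_crashloop": [
        "kube_pod_container_status_restarts_total",
        "kube_pod_status_phase",
    ],
    "node_pressure": [
        "node_memory_MemAvailable_bytes",
        "node_cpu_seconds_total",
    ],
}

def suggest_metrics(context: str) -> list[str]:
    """Return Prometheus metric names relevant to a diagnostic context,
    or an empty list when the context is unrecognized."""
    return CONTEXT_TO_METRICS.get(context, [])
```

In the agent-driven model, a lookup like this would be performed by the agent on the developer's behalf, based on the conversation context rather than an explicit key.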
CloudWatch observability, AIOps updates
But in the meantime, AWS customers still rely directly on cloud operations tools to manage their AI applications, and a series of updates to Amazon CloudWatch and AgentCore Observability rolled out this week.
These updates include:
- Generative AI observability, with support for latency, token usage, errors and more, compatible with agent frameworks such as LangChain, LangGraph and CrewAI.
- CloudWatch Application Signals, first released in September, which creates application topology maps without application instrumentation and now automatically groups resources according to their corresponding applications.
- CloudWatch investigations, released in June, which now automatically generate incident reports that include a “five whys” list for internal incident post-mortems.
- MCP servers for CloudWatch and Application Signals.
- General availability of a CloudWatch Application Signals integration with GitHub Actions, which automatically associates CloudWatch telemetry with an application's source code.
In her presentation on these new features, Ramani also highlighted multi-cloud support for telemetry data from AI applications.
“This is true no matter where you choose to host your agents,” she said. “You can instantly get complete visibility into latency, token usage, and performance across all your AI workloads.”
Scaling AI data and storage management
According to Ramani's presentation, Agentic AI is also driving demand among AWS customers for simplified storage and data management at scale.
“Today, our AI agents operate 24/7, processing thousands of customer requests every hour, and each of these requests generates a tremendous amount of telemetry throughout the day,” she said. “It's not just more data; it's an exponential increase in surface area that needs to be both monitored and protected.”
In response, AWS has issued multiple updates for observability and security data management, including natural language support for data pipelines; cross-account and cross-region log data collection; support for RDS and Aurora databases in CloudWatch Database Insights across multiple accounts and regions; and aggregated events for CloudTrail audit logs.
All of these incremental improvements add an important foundation for multicloud AI management, according to IDC's Flug.
“With AI, everyone wants access to a single source of truth for data, [but] we need to make it easy for people to get that,” he said. “If they have to build a physical connection themselves, which takes months, that doesn't really help them get started with AI.”
Beth Pariseau, senior news writer at Informa TechTarget, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.