AWS announced that Amazon WorkSpaces can now serve as managed virtual desktops for AI agents, letting them operate legacy desktop applications through computer vision and input simulation, with no application modernization or API integration required.
The problem it addresses is widespread. According to a 2024 Gartner report, 75% of organizations run legacy applications without modern APIs, and 71% of Fortune 500 companies run critical processes on mainframe systems without proper programmatic access. For these organizations, deploying an AI agent means choosing between an expensive modernization project and delaying implementation altogether.
WorkSpaces takes a different approach: it gives agents the same desktop that human employees use. The agent authenticates through IAM, connects to the WorkSpaces instance with a unique signed URL, and interacts with the application by taking screenshots (computer vision) and by clicking, typing, and scrolling (computer input). The application is not aware that an agent is driving it, and nothing about the software needs to change.
(Source: AWS News Blog Post)
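The observe-and-act cycle described above can be sketched in a few lines. This is only an illustration under stated assumptions: `FakeDesktop`, `decide`, and the screen and target names are hypothetical stand-ins, not part of any WorkSpaces API, and a real agent would send each screenshot to a vision model instead of looking up a fixed plan.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a remote desktop session (not a WorkSpaces API)."""
    screen: str = "login"
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real session would return image bytes; here the screen name is enough.
        return self.screen

    def click(self, target: str) -> None:
        self.log.append(("click", target))
        if self.screen == "login" and target == "sign_in":
            self.screen = "search"
        elif self.screen == "search" and target == "search_button":
            self.screen = "results"

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))


def decide(screen: str) -> list:
    """Hypothetical policy mapping what the agent sees to the next inputs."""
    plan = {
        "login": [("click", "sign_in")],
        "search": [("type", "patient 1042"), ("click", "search_button")],
    }
    return plan.get(screen, [])


def run_agent(desktop: FakeDesktop, max_steps: int = 10) -> int:
    """Loop: take a screenshot, pick actions, send inputs, repeat until done."""
    steps = 0
    for _ in range(max_steps):
        actions = decide(desktop.screenshot())
        if not actions:
            break  # nothing left to do on this screen
        for kind, arg in actions:
            if kind == "click":
                desktop.click(arg)
            else:
                desktop.type_text(arg)
        steps += 1
    return steps


desktop = FakeDesktop()
steps_taken = run_agent(desktop)
```

Each pass through the loop costs one screenshot, which is why the number of screens an agent must traverse matters so much for cost.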
Chris Noon, director of Nuvens Consulting, explained the value for regulated industries in the announcement:
WorkSpaces allows clients to provide AI agents with the same secure, managed desktop environment that their employees already use. No custom API integrations, full audit trails, and enterprise-grade isolation out of the box. For a regulated industry, this is a baseline, not a nice-to-have.
MCP integration makes this framework-agnostic. WorkSpaces exposes managed MCP endpoints, so any agent framework that speaks MCP can connect, including LangChain, CrewAI, and Strands Agents. AWS demonstrated a prescription refill workflow in a sample pharmacy system using a Strands agent built on Amazon Bedrock. The agent searched patient records, looked up medications, placed orders, and checked refills, all without an API.
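For readers unfamiliar with MCP, tool invocations travel as JSON-RPC 2.0 messages using the `tools/call` method. The sketch below only builds such a message. The tool names (`click`, `screenshot`) and their arguments are assumptions for illustration; the article does not list the tools the WorkSpaces endpoints actually expose.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id to match responses


def mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request for MCP's tools/call method."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and coordinates, for illustration only.
click_message = mcp_tool_call("click", {"x": 412, "y": 230})
```

Because the message shape is the same regardless of which framework produces it, any MCP-capable client can drive the desktop.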
The security model inherits everything that companies already have in place for their human WorkSpaces environments. The agent runs within an isolated WorkSpaces instance rather than on a local machine or the internal network. CloudTrail captures all activity for auditing, and CloudWatch provides observability. AWS recommends giving each agent its own IAM identity so that agent actions can be distinguished from human activity. Desktop screen resolution, image format, and agent capabilities (computer input, computer vision, and screenshot storage) are all configurable on a per-stack basis.
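To see why a dedicated identity per agent helps, consider a minimal sketch that separates agent activity from human activity in CloudTrail-style records using the `userIdentity.arn` field. The `agent-` role-name prefix, the account ID, and the event names are assumptions made for the example, not an AWS convention or real WorkSpaces event names.

```python
def role_name(record: dict) -> str:
    """Extract the role name from an assumed-role ARN, or '' for other identities."""
    arn = record.get("userIdentity", {}).get("arn", "")
    # Assumed-role ARNs look like arn:aws:sts::<account>:assumed-role/<role>/<session>
    parts = arn.split("/")
    if len(parts) >= 2 and ":assumed-role" in parts[0]:
        return parts[1]
    return ""


def split_by_actor(records: list, agent_prefix: str = "agent-"):
    """Partition records into (agent, human) using a hypothetical naming prefix."""
    agents, humans = [], []
    for record in records:
        target = agents if role_name(record).startswith(agent_prefix) else humans
        target.append(record)
    return agents, humans


# Illustrative records; event names are placeholders.
sample_records = [
    {"eventName": "ExampleAgentAction",
     "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/agent-refills/session-1"}},
    {"eventName": "ExampleHumanAction",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
]
agent_events, human_events = split_by_actor(sample_records)
```

If agents shared credentials with people, this split would not be possible from the audit log alone.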
On cost, there is clear skepticism. Reflex, an AI coding company, recently released a benchmark study showing that a vision agent consumed approximately 500,000 input tokens to complete a task that an API agent handled with 12,000 tokens. Reflex puts the difference at about 45x. Palash Awasthi, Head of Growth at Reflex, asserted:
As visual models improve, the error rate per screenshot decreases, but the number of screenshots required to reach relevant data does not decrease.
The vision agent also took 17 minutes, compared to 20 seconds for the API path. Awasthi acknowledged that better models would ultimately reduce costs, but argued that vision-based agents will always require more steps than API-based alternatives.
This tradeoff is exactly the point AWS is making. Computer-use agents and APIs solve fundamentally different problems. If an API exists, agents should use it. However, much enterprise software, including traditional ERP systems, thick-client applications, and proprietary tools, has no API access.
For these applications, an agent that is 45 times more expensive may still be cheaper than a multi-year modernization project. The question for each organization is whether the value of workflow automation is worth the token cost at a given scale. The ephemeral nature of cloud desktops helps control costs. Rather than maintaining always-on infrastructure, organizations can launch WorkSpaces instances for specific tasks and shut them down when the agents finish.
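A rough way to frame that question is a break-even calculation. The sketch below uses the approximate token counts and timings from the Reflex benchmark cited above; the token price and the modernization budget are purely hypothetical inputs that each organization would replace with its own numbers. With the rounded figures quoted here the token ratio works out to roughly 42x, a little under the 45x Reflex reports, presumably because the quoted counts are approximations.

```python
# Approximate figures from the Reflex benchmark cited in this article.
VISION_TOKENS = 500_000
API_TOKENS = 12_000
VISION_SECONDS = 17 * 60
API_SECONDS = 20


def token_ratio() -> float:
    """How many times more input tokens the vision path used."""
    return VISION_TOKENS / API_TOKENS


def time_ratio() -> float:
    """How many times longer the vision path took."""
    return VISION_SECONDS / API_SECONDS


def break_even_tasks(modernization_cost: float, price_per_million_tokens: float) -> float:
    """Number of tasks at which extra token spend equals the modernization cost.

    Both arguments are hypothetical inputs, not figures from AWS or Reflex.
    """
    extra_tokens = VISION_TOKENS - API_TOKENS
    extra_cost_per_task = extra_tokens / 1_000_000 * price_per_million_tokens
    return modernization_cost / extra_cost_per_task


# Hypothetical example: a $2M modernization project and $3 per million input tokens.
tasks = break_even_tasks(modernization_cost=2_000_000, price_per_million_tokens=3.0)
```

Under those assumed inputs the vision agent stays cheaper until well over a million tasks, which illustrates why the answer depends heavily on volume. The sketch ignores desktop instance charges and the latency gap, both of which would shift the result.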
Microsoft is pursuing a similar approach with Windows 365 for AI agents, creating a parallel category of cloud desktop services where AI systems interact with software through UIs rather than APIs.
WorkSpaces Agent Access is available in preview in US East (N. Virginia, Ohio), US West (Oregon), Canada (Central), Europe (Frankfurt, Ireland, Paris, London), and Asia Pacific (Tokyo, Mumbai, Sydney, Seoul, Singapore). A GitHub repository with sample code is now available.
