Microsoft Copilot Cowork tackles multi-step AI automation

Microsoft launched Copilot Cowork within the Microsoft 365 Frontier program to enable autonomous multi-step task execution across applications such as Excel, Outlook, and SharePoint. The system is based on Anthropic’s Cowork technology and uses a multi-model approach that combines OpenAI’s GPT and Anthropic’s Claude to deliver more accurate research results.

The new features are designed to handle long-running, multi-step workflows autonomously. Until now, Copilot has focused on generative tasks such as email drafts and document summaries. These are one-shot outputs, and humans still have to stitch the pieces together themselves to get any benefit from automation. Cowork is meant to automate the whole task instead. You describe the outcome you want, and the system creates a plan and executes it across your Microsoft 365 applications, without a human directing each step.

The announcement was made today and is part of Wave 3 of Microsoft 365 Copilot, which Microsoft describes as a turning point toward agentic AI embedded in the workplace. Copilot Cowork is currently available through the company’s Frontier program, which gives companies early access to cutting-edge capabilities. It could eventually become the basis for the new E7 tier of Microsoft 365, the top of the company’s subscription-based workplace suite.

Built on Anthropic’s Cowork

The technology behind the new features is well known, as is the naming scheme. Anthropic launched Cowork in January as an agent tool for broader knowledge work, built on the same principles as Claude Code but aimed at non-technical users. Plug-in support followed in February, further broadening its enterprise appeal. Microsoft has now integrated the same technology platform directly into Microsoft 365 Copilot.

Within Microsoft 365, Copilot Cowork acts as an orchestrator, reasoning across files and applications such as Excel, Outlook, Teams, and SharePoint. It can take over entire tasks such as monthly budget reviews, which typically require jumping back and forth between spreadsheets, emails, and documents. The system operates within the Work IQ framework, which grounds it in your organization’s data while observing security and governance boundaries. Human supervision is maintained throughout: users can monitor progress and redirect the agent if it goes off track.
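
The plan-then-execute pattern with a human checkpoint can be illustrated with a minimal Python sketch. This is not Microsoft's implementation or API. The names `Step`, `RunLog`, `run_plan`, `tools`, and `approve` are all hypothetical, and the "applications" are plain functions standing in for real integrations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    app: str     # hypothetical label for the application that handles the step
    action: str  # plain-language description of what to do


@dataclass
class RunLog:
    completed: List[str] = field(default_factory=list)
    stopped_early: bool = False


def run_plan(
    plan: List[Step],
    tools: Dict[str, Callable[[str], str]],
    approve: Callable[[Step], bool],
) -> RunLog:
    """Execute steps in order; the approve callback is the human checkpoint."""
    log = RunLog()
    for step in plan:
        if not approve(step):
            # The supervisor declined this step, so the run halts here.
            log.stopped_early = True
            break
        log.completed.append(tools[step.app](step.action))
    return log


# Stand-in tools: each just reports what it was asked to do.
tools = {
    "Excel": lambda action: f"Excel did: {action}",
    "Outlook": lambda action: f"Outlook did: {action}",
}

plan = [
    Step("Excel", "total the monthly spend"),
    Step("Outlook", "email the summary to finance"),
]

# A supervisor who allows spreadsheet work but blocks outgoing email.
log = run_plan(plan, tools, approve=lambda step: step.app != "Outlook")
print(log.completed, log.stopped_early)
```

In this sketch the supervisor can only stop the run. Redirecting the agent, as the article describes, would additionally need a way to edit the remaining steps.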

“Connect steps, coordinate tasks, and execute entire daily workflows,” says Barton Warner, senior vice president of enterprise technology at Capital Group, one of the first organizations to use the system.

Multi-model Researcher

Copilot Cowork also brings changes to the Researcher agent. It now uses a “critique” layer that pairs OpenAI’s GPT model with Anthropic’s Claude. One model produces an initial response, and the other reviews it for accuracy and citation quality. Microsoft said this increased Researcher’s score on the DRACO benchmark, an industry measure of the quality of deep research, by 13.8%. The roles can also be reversed, and a new “Model Council” feature allows users to compare outputs side by side.
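
The draft-and-review arrangement described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Researcher is actually built: `research_with_critique`, `Review`, `stub_drafter`, and `stub_critic` are hypothetical names, and the two "models" are simple stub functions and not calls to GPT or Claude.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    approved: bool
    notes: List[str]


# Hypothetical role signatures: a drafter answers a question given reviewer
# notes; a critic judges a draft and returns a Review.
Drafter = Callable[[str, List[str]], str]
Critic = Callable[[str, str], Review]


def research_with_critique(
    question: str, drafter: Drafter, critic: Critic, max_rounds: int = 2
) -> str:
    """One model drafts, the other reviews; redraft until approved or out of rounds."""
    draft = drafter(question, [])
    for _ in range(max_rounds):
        review = critic(question, draft)
        if review.approved:
            break
        draft = drafter(question, review.notes)
    return draft


def stub_drafter(question: str, notes: List[str]) -> str:
    # Stand-in for the drafting model: adds a citation only when asked to.
    answer = f"Answer to: {question}"
    if notes:
        answer += " [source: example.org]"
    return answer


def stub_critic(question: str, draft: str) -> Review:
    # Stand-in for the reviewing model: approves only drafts that cite a source.
    if "[source:" in draft:
        return Review(approved=True, notes=[])
    return Review(approved=False, notes=["add a citation"])


result = research_with_critique("What changed in Wave 3?", stub_drafter, stub_critic)
print(result)
```

Because the roles are just two callables, reversing them amounts to passing a different model in each position, which matches the article's remark that the roles can be swapped.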

Benchmarks aside, the new tools still need to prove to users that they are more useful than the Copilot they already know and, in many cases, do not much like. Genuine agent capabilities that take over mechanical tasks are appealing, and tools like OpenClaw demonstrate that autonomous actions can be diverse and scalable. The Frontier program is currently offering access to Cowork ahead of a broader rollout. The hope is that the full release will provide enterprise-grade protection and will not contain the kind of security flaws found in tools like OpenClaw.


