Microsoft deploys MDASH for large-scale AI vulnerability research



Microsoft has introduced a new AI-driven vulnerability discovery system called MDASH, a multi-model agent security platform designed to automate code audits at scale across Windows and other Microsoft software environments. The system combines over 100 specialized AI agents that work together to scan, verify, discuss, and prove vulnerabilities across complex codebases.

The announcement marks a shift in AI-assisted cybersecurity from evaluating individual models to building integrated systems around coordinated agents, verification processes, and automated proof generation. Microsoft emphasizes that the framework surrounding a model matters more than any single model, especially for large proprietary codebases such as Windows, Hyper-V, and Azure.

According to Microsoft, MDASH scored 88.45% on the public CyberGym benchmark of 1,507 real-world vulnerabilities, beating the next highest entry by approximately 5 points. Internally, Microsoft reports 96% recall on past clfs.sys vulnerabilities reviewed by the Microsoft Security Response Center and 100% recall on past tcpip.sys cases.

Figure: MDASH benchmark results. Source: Microsoft Blog

MDASH operates as a multi-stage pipeline rather than relying on a single model or prompt chain. Dedicated agents handle scanning, discussion, verification, deduplication, and exploitation independently. Microsoft says this architecture helps the system reason across multiple files, identify lifecycle and concurrency bugs, and verify whether a vulnerability is actually exploitable rather than just theoretical.
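To make the staged design concrete, here is a minimal sketch of a pipeline with the five stages named above. It is not Microsoft's implementation: the `Finding` type, the toy pattern scanners, and every stage body are hypothetical stand-ins chosen only to show how independent stages can be chained.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Finding:
    """Hypothetical record passed between stages."""
    file: str
    line: int
    kind: str
    verified: bool = False
    proof: Optional[str] = None


def make_scanner(pattern: str) -> Callable[[Dict[str, str]], List[Finding]]:
    """Build a toy scanning agent that flags lines containing `pattern`."""
    def scan(sources: Dict[str, str]) -> List[Finding]:
        return [
            Finding(name, number, "unchecked-copy")
            for name, text in sources.items()
            for number, line in enumerate(text.splitlines(), start=1)
            if pattern in line
        ]
    return scan


def discuss(findings: List[Finding]) -> List[Finding]:
    # Toy "discussion": a second opinion discards reports in test files.
    return [f for f in findings if not f.file.startswith("test_")]


def verify(findings: List[Finding]) -> List[Finding]:
    # Placeholder: marks every surviving finding as verified.
    # A real stage would run an actual check before doing so.
    return [replace(f, verified=True) for f in findings]


def deduplicate(findings: List[Finding]) -> List[Finding]:
    # Frozen dataclasses are hashable, so dict.fromkeys drops exact
    # duplicates while preserving first-seen order.
    return list(dict.fromkeys(findings))


def prove(findings: List[Finding]) -> List[Finding]:
    # Placeholder: attaches a label where a real stage would attach a repro.
    return [
        replace(f, proof=f"placeholder repro for {f.file}:{f.line}")
        for f in findings
    ]


def run_pipeline(sources: Dict[str, str]) -> List[Finding]:
    """Run two overlapping scanners, then each later stage in order."""
    findings: List[Finding] = []
    for scanner in (make_scanner("memcpy("), make_scanner("cpy(")):
        findings.extend(scanner(sources))
    for stage in (discuss, verify, deduplicate, prove):
        findings = stage(findings)
    return findings
```

Because each stage only consumes and returns a list of findings, any one of them can be replaced or tested in isolation, which is the property the multi-stage description relies on.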

Much of the announcement focused on the idea that future AI security tools will depend less on raw model capability and more on the orchestration systems built around the models. Microsoft explained that MDASH is designed to be model-agnostic, allowing teams to replace or upgrade models while keeping the surrounding validation, proof, and workflow infrastructure intact.
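One common way to get this kind of model independence is to have the surrounding workflow depend only on a narrow interface. The sketch below illustrates that idea under stated assumptions: the `Model` protocol, the two toy models, and the `Auditor` class are all hypothetical and are not taken from MDASH.

```python
from typing import Protocol


class Model(Protocol):
    """Hypothetical minimal interface the workflow depends on."""
    def complete(self, prompt: str) -> str: ...


class KeywordModel:
    """Toy model: flags only strcpy."""
    def complete(self, prompt: str) -> str:
        return "VULNERABLE" if "strcpy" in prompt else "SAFE"


class CautiousModel:
    """Toy replacement model: flags strcpy and memcpy."""
    def complete(self, prompt: str) -> str:
        risky = ("strcpy", "memcpy")
        return "VULNERABLE" if any(name in prompt for name in risky) else "SAFE"


class Auditor:
    """Workflow code that stays the same when the model is swapped."""
    ALLOWED = {"VULNERABLE", "SAFE"}

    def __init__(self, model: Model) -> None:
        self.model = model

    def audit(self, snippet: str) -> bool:
        answer = self.model.complete(f"Audit this code:\n{snippet}")
        # Validation lives outside the model, so it survives a model swap.
        if answer not in self.ALLOWED:
            raise ValueError(f"unexpected model output: {answer!r}")
        return answer == "VULNERABLE"
```

Swapping models is then a one-line change (`auditor.model = CautiousModel()`), while the output validation in `audit` is untouched.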

This release also sparked a discussion about the operational risks of large-scale agent security systems. In a thread on LinkedIn, Sandesh KS wrote:

The orchestration layer is really interesting and dangerous at the same time. When specialized agents begin coordinating identity systems, financial monitoring, and entire cloud infrastructures at the same time, the scope of a single misconfigured privilege boundary becomes enormous. Governance layers should be designed before the agent goes live, rather than being retrofitted after the first incident.

MDASH is currently being tested internally by the Microsoft security team and through a limited private preview with some customers. The company said organizations interested in testing the system can apply through the Microsoft Security preview program.




