Microsoft Maia 200 AI chip could increase cloud GPU supply

Microsoft is likely to be the biggest consumer of the new Maia 200 AI accelerator, which could have a domino effect on cloud AI infrastructure, according to industry analysts.

Maia 200, launched this week, reflects a broader industry shift toward AI inference, the stage of a generative AI workflow in which a trained large language model is applied to data to generate outputs. Nvidia last month launched six new chips, including Vera CPUs and Rubin GPUs designed for AI inference, along with rack-scale Vera Rubin hardware and software packages supported by enterprise IT vendors such as Red Hat.

Microsoft’s Maia 200 rivals AWS’ Trainium and Google’s TPU as an AI accelerator: a chip built for specific AI processing tasks, rather than a general-purpose GPU that runs a broad range of AI models. Both the Maia 200 and Nvidia’s chips are built with AI inference in mind, but they take different approaches, said Mike Leone, an analyst at Omdia, a division of Informa TechTarget.

“Vera Rubin is built for more complex types of inference, such as when a single query triggers a large inference chain to answer a multi-step problem,” said Leone. “Maia is focused on processing millions of queries for Copilot and other more standard chatbots. The goal is not necessarily deep inference, but processing at scale at the lowest possible cost.”

Potential ripple effects of Maia 200


The Microsoft Superintelligence team will use the Maia 200 internally, according to a company blog post this week. Microsoft also has a software development kit (SDK) in preview for AI engineers who want to use it, but Leone said internal workloads will likely be the chip’s primary consumers early on.

But similar to Nvidia’s Vera Rubin, a multimillion-dollar system that most mainstream IT buyers don’t have access to, the arrival of the Maia 200 could have indirect benefits for Microsoft cloud customers, Leone said.

“If they move large internal workloads like Copilot to Maia, they will effectively stop competing with their own customers for access to Nvidia GPUs,” he said.

But switching from Nvidia to Maia is not an easy task, warned Forrester Research analyst Naveen Chhabra.

“You can think of Nvidia’s CUDA library and Microsoft’s Maia SDK as two railroad tracks that aren’t necessarily compatible. If you want to move a freight car from one track to the other, you need to make sure the cars, aka the apps, are compatible,” he said.

Microsoft has amassed a large stock of Nvidia AI chips and systems used to build and run its own AI applications as well as sell GPUs to Azure customers, Chhabra said. He added that it’s hard to say whether the introduction of Maia will free up enough Nvidia GPU inventory to sell more to customers, as the hyperscaler doesn’t disclose specific usage numbers.

Another analyst said he thinks there will be some appetite among Azure customers to run AI inference workloads on Maia because it is likely to be much cheaper than Nvidia GPUs. It’s a similar value proposition to other hyperscaler AI accelerators, HyperFrame Research CEO Steven Dickens said in an interview with Informa TechTarget this week.


“Maia is [Microsoft’s answer to the Google] TPU, and that makes perfect sense,” he said. “Azure offering cheaper options for inference workloads also makes sense.”

Leone acknowledged the potential pain of moving from Nvidia to Maia, but predicted that some IT buyers would be willing to make the trade-off.

“Long term, I think IT leaders will have to choose between portability and price,” Leone said. “You can commit to Maia for significantly better Azure economics, or you can stick with Nvidia for cloud-wide flexibility. Of course, that comes with the tradeoff of being more locked into the ecosystem.”

Chipping away at Nvidia’s “stranglehold”?


Concerns about Nvidia’s dominant position in AI chips have grown over the past year among some industry experts. Financial analysts estimated Nvidia’s GPU market share at 94% in Q2 2025. Given how heavily the company’s chips are used, the Compute Unified Device Architecture (CUDA) parallel computing framework that runs them has a similar grip on the enterprise technology industry, according to analysts who pressed Cloud Native Computing Foundation leadership at last year’s KubeCon conference on whether the open source community could introduce competitors to CUDA.

Dickens, who participated in the conversation at KubeCon, predicted that Maia 200, Trainium, and TPUs will begin to weaken Nvidia’s lead in the long term.

“Nvidia will cede ground to TPUs, accelerators like Maia, AMD and eventually Intel,” Dickens said. “Nvidia’s 90% market share will normalize over time, but there is still plenty of room to grow as the market expands.”

Chhabra said he does not expect that shift for at least the next three to four years.

“We’re looking at several aspects to arrive at that number,” he said. “Customer mindshare and sentiment, demand, ecosystem support, modernity, product and launch maturity, and proven architecture.”

Beth Pariseau, senior news writer at Informa TechTarget, is an award-winning veteran of IT journalism. Have a tip? Email her.


