Google’s new chips take on Nvidia in AI inference

Google unveiled its latest generation of AI chips Wednesday, splitting its line into two areas of focus for the first time: training and inference.

Google’s Tensor Processing Units (TPUs) have grown into credible rivals to Nvidia’s chips, which remain the dominant silicon across the AI industry. Anthropic is a major TPU customer, and Apple has used Google’s chips to train its AI models.

As the AI battleground shifts to inference (the process of actually running a model after it’s deployed), Google is responding accordingly.

Google’s new TPU 8t is designed for training the largest frontier AI models, and the TPU 8i is built for inference. Google says both chips will be available later this year.

The split signals a shift underway across the industry. As model quality improves and the gap between leading labs narrows, attention is turning to the agents and applications built on top of those models, which demand more computing power. In effect, the economic center of AI is moving up the stack to the inference layer.

Nvidia is also preparing for a surge in inference. The company signed a $20 billion licensing deal with inference chip maker Groq late last year, and last month debuted a new chip designed to speed up inference.

Google says both new chips are a significant improvement over the seventh-generation Ironwood TPUs released last year. The new 8i inference chip makes major advances in high-bandwidth memory (HBM). Google says this addresses the “memory wall”: the gap between how fast a processor can perform calculations and how fast it can fetch the data those calculations need. That gap matters most when running agents, which fire off many model calls in quick succession.
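To see why that gap bites hardest at inference time, consider a back-of-envelope sketch. Every number below is an illustrative assumption (a hypothetical 70-billion-parameter model and round figures for bandwidth and compute), not a spec for Google’s chips:

```python
# Back-of-envelope illustration of the "memory wall" in LLM inference.
# All figures are illustrative assumptions, not specs for any real chip.
params = 70e9              # hypothetical 70B-parameter model
bytes_per_param = 2        # 16-bit weights
hbm_bandwidth = 3e12       # assumed HBM bandwidth, bytes/s
peak_flops = 1e15          # assumed peak compute, FLOP/s

# Generating one token touches every weight once: roughly 2 FLOPs
# and 2 bytes of memory traffic per parameter.
flops_per_token = 2 * params
bytes_per_token = params * bytes_per_param

compute_time = flops_per_token / peak_flops    # ~0.14 ms per token
memory_time = bytes_per_token / hbm_bandwidth  # ~47 ms per token

print(f"compute-limited: {compute_time * 1e3:.2f} ms/token")
print(f"memory-limited:  {memory_time * 1e3:.2f} ms/token")
# Under these assumptions, the chip spends far longer moving weights
# than multiplying them, which is why inference-focused designs push
# on memory bandwidth rather than peak FLOPs.
```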

From answering questions to action

Google Cloud CEO Thomas Kurian told reporters on Monday that the decision to develop two new chips was a “natural evolution.”

Kurian also said the new chip was designed to be power efficient, “because we felt that as people continue to scale both training and inference, power efficiency becomes a constraint.”

This is also Google’s bet that agents will be the next big leap in AI.

“AI is evolving from answering questions to inferring and taking action,” Google infrastructure executives Amin Vahdat and Mark Lohmeyer said in a blog post announcing the new chips.

Google, Amazon, and Microsoft are all competing to develop custom silicon that could reduce their dependence on Nvidia. At the same time, all three rely on Nvidia hardware to train their own models and rent Nvidia chips out to customers through their data centers.

Google uses its own TPU chips to train its Gemini models, but sells access to Nvidia’s chips through Google Cloud. Google has announced that it will provide customers with access to Nvidia’s next-generation Vera Rubin GPUs later this year.

Google has spent more than a decade developing its own silicon, but in recent years it has ramped up efforts to attract new customers. For example, it has opened up support for tools like PyTorch, making it easier for enterprises to adopt TPUs.
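As a rough illustration of what that PyTorch support looks like in practice, here is a minimal sketch using the open-source torch_xla bridge that connects PyTorch to TPUs. The toy model and tensor shapes are placeholders for illustration, not anything tied to the new chips:

```python
# Minimal sketch: running a PyTorch model on a TPU via the open-source
# torch_xla bridge. Model, sizes, and batch are placeholder assumptions.
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()                         # acquire a TPU core as a torch device
model = torch.nn.Linear(1024, 1024).to(device)   # toy model standing in for a real one
x = torch.randn(8, 1024, device=device)

y = model(x)     # operations are recorded lazily into an XLA graph
xm.mark_step()   # compile and execute the pending graph on the TPU
print(y.shape)   # torch.Size([8, 1024])
```

The appeal for enterprises is that existing PyTorch code mostly carries over: the device handle changes, while the model and training loop stay the same.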

This could allow Google to chip away at Nvidia’s advantage while meaningfully increasing its own revenue. Morgan Stanley said in a December note that selling 500,000 TPU chips could add about $13 billion to Google’s top line in 2027.

Have something to share? Contact this reporter by email at hlangley@businessinsider.com or by text at 628-228-1836. Use a personal email address and a nonwork device. Here’s a guide to sharing your information securely.