Vera Rubin: Nvidia just laid out its plans for the future of AI

Las Vegas

Nvidia provided a detailed look at Vera Rubin, its new computing platform for AI data centers. This release could have a major impact on the future of AI, given the industry's heavy reliance on the company's technology.

Nvidia previously announced some details about Vera Rubin, but on Monday at the CES technology conference in Las Vegas it explained how the system works and revealed when it will be available. Vera Rubin is currently in production, with the first products running on it expected to arrive in late 2026.

Nvidia has become the poster child for the AI boom, briefly becoming the world's first $5 trillion company last year as its AI chips and platforms became more widespread. But the company is also battling concerns about an AI bubble amid increased competition and a push by tech companies to make their own AI chips to reduce their dependence on Nvidia.

Nvidia CEO Jensen Huang, wearing his signature leather jacket, opened his remarks on stage at a theater inside Fontainebleau Las Vegas by addressing the question at the heart of the bubble debate: where the funding for AI will come from. He said companies are moving research and development budgets from classic computing methods to artificial intelligence.

“People ask where the money comes from. That's where the money comes from,” he said.

The Vera Rubin platform is Nvidia's attempt to position itself as the answer to the computing challenges posed by increasingly demanding AI models, including whether existing infrastructure can handle ever more complex AI queries. The company claims in a press release that its upcoming AI server rack, called the Vera Rubin NVL72, will “deliver more bandwidth than the entire Internet.”

Nvidia says it has leveraged Vera Rubin to develop a new type of storage system that enables AI models to handle more complex and context-sensitive requests more quickly and competently. As companies like Google, OpenAI, and Anthropic move from offering simple chatbots to providing full-fledged AI helpers, the existing types of storage and memory used in traditional computers, and even the graphics processing units that power data centers, will no longer be enough.

Huang described the shift from chatbots to agents on Monday. In a video demonstration, someone turned a friendly-looking tabletop robot into a personal assistant by connecting multiple AI models running on Nvidia's DGX Spark desktop computers. The robot was able to do things like recount a user's to-do list and tell a dog to get off the couch.

Huang said that while creating such an assistant was unimaginable a few years ago, it's “absolutely easy” now that developers can rely on large language models rather than traditional programming tools to build apps and services.

In other words, Nvidia argues that as AI becomes more sophisticated and “reasons” about these multi-step tasks, the old ways of doing things will no longer work.

“The bottleneck is moving from compute to context management,” Dion Harris, Nvidia's senior director of high-performance computing and AI hyperscale solutions, said on a call with reporters ahead of the press conference.

“Storage can no longer be an afterthought,” he added.

Ahead of CES, Nvidia also signed a licensing deal with Groq, a company that specializes in inference, in another sign that it is investing heavily in the AI space.

“Instead of a one-shot answer, inference is now a thought process,” Huang said, referring to the process by which AI models “think” and “reason” through answers to accomplish tasks.

Nvidia said in a press release that major cloud providers, including Microsoft, Amazon Web Services, Google Cloud, and CoreWeave, will be among the first to adopt Vera Rubin. Computing companies like Dell and Cisco are expected to incorporate the new chips into their data centers, and AI labs like OpenAI, Anthropic, Meta, and xAI may use the new technology to train models and provide more sophisticated answers to queries.

Building on the vision it laid out at October's GTC conference, Nvidia also deepened its commitment to self-driving cars, with a new model called Alpamayo, and to “physical AI,” the type of AI that powers robots and other real-world machines.

But Nvidia's progress and widespread adoption also mean it carries the burden of consistently exceeding Wall Street's lofty expectations and allaying concerns that spending on AI infrastructure far exceeds tangible demand.

Companies like Meta, Microsoft and Amazon are making tens of billions of dollars in capital investments this year alone, and McKinsey & Company predicts that companies will invest nearly $7 trillion in data center infrastructure around the world by 2030. And much of the support pouring into AI appears to be tied to a relatively small group of companies that move money and technology back and forth in so-called “circular financing.”

Google and OpenAI are also focusing on developing their own chips, allowing them to better tailor their hardware to the specific needs of their models. Nvidia faces increased competition from AMD as well, and chipmaker Qualcomm recently announced its entry into the data center business.

“No one wants to be at the mercy of Nvidia,” Ben Barringer, global head of technology research at investment firm Quilter Cheviot, said in a previous CNN interview when asked about the possibility of other companies like Google challenging Nvidia with AI chips. “They're trying to diversify their chip footprint.”
