Nvidia scientists expect “everything that moves will eventually be autonomous” • The Register



GenAI Summit 2024 kicked off on Wednesday at the Palace of Fine Arts in San Francisco, California, leaving some of those who came to hear about artificial intelligence confused.

As the event began around 9am, many attendees were waiting in disorganized lines outside the venue while staff scrambled to find badges. After identifying myself as a member of the media, I was let in without any ID check. Everyone was eager to get to the show. Inside, entry to the VIP-only AGI keynote was similarly lax.

Jim Fan, a senior research scientist at Nvidia and leader of the company's AI Agents Initiative, kicked off the show by looking back at the history of artificial intelligence, starting with Claude Shannon's endgame chess machine and noting various milestones on the road to the “agent era.”

To start at the end of Fan's presentation: the agent era is where AI technology is heading, toward software agents that orchestrate how underlying models interact with other models and systems.

Fan argues that the “agent” era is the next technological step after the “generative,” “neural,” and “classical” eras of AI.

“I believe in a future where everything that moves will eventually be autonomous,” Fan said, without dwelling on the potential implications.

Fan is bringing that vision to life through his work at Nvidia's GEAR Lab, which stands for Generalist Embodied Agent Research.

Slides from Jim Fan's GenAI Summit 2024 presentation

Fan explained that a generalist agent needs to be able to survive, move and explore in an infinite world – it needs to have broad knowledge of that world and be able to perform almost any task.

“First, the environment needs to be open-ended enough, because the capabilities of the agent are ultimately capped by the complexity of the environment,” Fan said, “and the Earth we live on is a perfect example. The Earth is so complex that an algorithm called natural evolution, running for billions of years, has produced every single human being in this room.”

“You also need a huge amount of data because you can't explore from scratch,” Fan said. “You need common sense to bootstrap the learning.”

And you need a strong enough underlying model to learn from all these sources, he said. “And that's where this idea led us to Minecraft,” Fan said.

Fan believes that Minecraft and a set of related projects will let engineers train basic agents to perform a variety of useful tasks. Those projects are MineDojo, which consists of a simulator, database and agents; Voyager, a lifelong learning agent for Minecraft; Eureka, an agent for training robots; MetaMorph; and Isaac Sim.

Minecraft can be used as a simulator to teach agents how to perform specific tasks, and Isaac Sim makes that training incredibly fast.

“Isaac Sim's greatest strength is that it can run physics simulations more than 1,000 times faster than real time,” said Fan.

In other words, the path from chatbot to robot that can perform useful tasks in the real world is greatly shortened by simulation tools that can transform years of training into days. Indeed, in demonstrations of teaching a robotic hand to spin a pen on its fingers, the software can outperform most human pen spinners, provided the hardware is up to the task.
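The arithmetic behind "years into days" can be sketched in a few lines. This is only an illustration: it assumes a flat 1,000x speedup (the figure Fan quoted as a floor), ignores setup overhead and any parallel environments, and the function name is ours rather than anything from Isaac Sim.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years for a rough estimate


def wall_clock_hours(simulated_years: float, speedup: float) -> float:
    """Wall-clock hours needed to simulate `simulated_years` of experience
    when the simulator runs `speedup` times faster than real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return simulated_years * HOURS_PER_YEAR / speedup


# One simulated year at 1,000x real time takes under nine hours.
one_year = wall_clock_hours(1, 1000)

# Ten simulated years at the same speed takes under four days.
ten_years_in_days = wall_clock_hours(10, 1000) / 24

print(f"1 simulated year:   {one_year:.2f} hours")
print(f"10 simulated years: {ten_years_in_days:.2f} days")
```

At that rate a decade of simulated practice fits into less than a working week of compute, which is the shortening the talk described.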

“There really aren't any five-finger hardware hands in the world that have this level of power and agility to spin a pen,” Fan said, “so we're still waiting for hardware providers to catch up with Eureka.”

But for some applications, such as teaching a robotic dog to walk and balance on a deformable yoga ball, the groundwork looks promising.

“We think training the foundational agent is very similar to ChatGPT,” Fan said. “Any language task can be expressed as text input and text output, and ChatGPT simply scales that up by training on large amounts of text. The foundational agent is very similar, except that it takes an embodiment specification and a language instruction as its prompt and outputs actions.”
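The contrast Fan draws is one of interfaces: a language model maps text to text, while the agent he describes maps a description of the robot's body plus an instruction to actions. A minimal sketch of the two shapes follows. Every name in it (`BodySpec`, `Action`, `placeholder_agent`) is hypothetical and invented for illustration; nothing here reflects GEAR Lab's actual code, and the stand-in policy does no learning at all.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class BodySpec:
    """Hypothetical description of a robot body: a label and its joints."""
    name: str
    joint_names: Tuple[str, ...]


@dataclass(frozen=True)
class Action:
    """Hypothetical low-level command: a target value for one joint."""
    joint: str
    target: float


# A language model, viewed as an interface: text in, text out.
LanguageModel = Callable[[str], str]

# The agent, viewed as an interface: (body, instruction) in, actions out.
Agent = Callable[[BodySpec, str], List[Action]]


def echo_model(prompt: str) -> str:
    """Stand-in language model that just upper-cases its prompt."""
    return prompt.upper()


def placeholder_agent(body: BodySpec, instruction: str) -> List[Action]:
    """Stand-in policy: ignores the instruction and returns one neutral
    action per joint, purely to show the shape of the output."""
    return [Action(joint=j, target=0.0) for j in body.joint_names]


dog = BodySpec(
    name="quadruped",
    joint_names=("front_left", "front_right", "rear_left", "rear_right"),
)
actions = placeholder_agent(dog, "balance on the ball")
print(len(actions), "actions for", dog.name)
```

Because the body description is part of the prompt, the same agent interface could in principle be pointed at a robot hand or a robot dog without changing its signature.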

“The foundational agent is the next chapter for GEAR Lab,” he said.

The robots are coming.®


