Google DeepMind on Thursday announced the next generation of its gaming-focused artificial intelligence agent, SIMA 2 (Scalable Instructable Multiworld Agent). The upgraded system builds on the first version, launched in March 2024, and brings significant improvements in reasoning, adaptability, and user interaction. According to the company, the agent can continue to learn and improve its abilities through its own play.
How does SIMA 2 work?
In its announcement, DeepMind highlighted that SIMA 2 can now reflect on its actions and reason through the steps needed to complete a task. The agent is powered by Google’s Gemini model and is designed to follow instructions issued by humans, understand the questions it is asked, and plan its next actions based on the virtual environment it sees on screen.
The system receives visual input from the three-dimensional game world, along with user-defined goals such as “build a shelter” or “find the red house.” It then breaks that goal down into a series of smaller actions and executes them using controls similar to a keyboard and mouse.
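The loop described above (observe the screen, split the goal into smaller steps, emit keyboard-and-mouse style actions) can be illustrated with a minimal Python sketch. This is not DeepMind's implementation: `plan_subgoals`, `act_on`, `run_agent`, the `Action` type, and the hard-coded plan are all hypothetical stand-ins for what the Gemini-based model would actually do.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    device: str   # "keyboard" or "mouse" (illustrative only)
    command: str  # free-text description of the input to send


def plan_subgoals(goal: str) -> List[str]:
    """Hypothetical planner: a lookup table stands in for the model."""
    known_plans = {
        "build a shelter": ["gather wood", "place walls", "place roof"],
    }
    # Unknown goals are treated as a single step.
    return known_plans.get(goal, [goal])


def act_on(subgoal: str, frame: str) -> Action:
    """Hypothetical policy: map one subgoal plus the current frame to an input."""
    return Action(device="keyboard", command=f"{subgoal} (seen: {frame})")


def run_agent(goal: str, get_frame: Callable[[], str]) -> List[Action]:
    """Observe, plan, and act once per subgoal."""
    actions = []
    for subgoal in plan_subgoals(goal):
        frame = get_frame()  # fresh visual input before every step
        actions.append(act_on(subgoal, frame))
    return actions
```

Calling `run_agent("build a shelter", lambda: "forest clearing")` yields one action per planned step. A real agent would replan as the scene changes rather than follow a fixed list.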
What can SIMA 2 do?
According to the company, one of the most important advances is SIMA 2’s increased ability to work in games it has never encountered before. DeepMind tested the agent in new environments such as MineDojo, a research adaptation of Minecraft, and ASKA, a Viking-themed survival game. In both cases, SIMA 2 achieved higher success rates than its predecessor.
The system also handles multimodal prompts, including sketches, emojis, and various languages. It can apply concepts learned in one game to another. For example, understanding mining in a sandbox world can help it understand harvesting in another survival environment.
How is SIMA 2 trained?
Google says the second-generation agent is trained using a combination of human demonstration data and automatically generated annotations from the Gemini model. As SIMA 2 learns new moves and skills in a fresh environment, that experience is captured and fed back into the training pipeline. DeepMind says this reduces reliance on human-labeled samples and allows the agent to improve itself over time.
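The feedback loop can be pictured as a training pool that starts with human demonstrations and grows as the agent's own experience is added. The Python sketch below is a loose illustration under stated assumptions: the `Episode` and `TrainingPool` names, the `score` field, and the `min_score` filter are hypothetical and do not describe DeepMind's actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Episode:
    task: str
    trajectory: List[str]
    source: str         # "human" or "self_play"
    score: float = 1.0  # hypothetical quality estimate, e.g. from a model


@dataclass
class TrainingPool:
    episodes: List[Episode] = field(default_factory=list)

    def add(self, episode: Episode, min_score: float = 0.5) -> bool:
        """Keep human demos always; keep self-play only above a threshold."""
        if episode.source == "human" or episode.score >= min_score:
            self.episodes.append(episode)
            return True
        return False

    def human_fraction(self) -> float:
        """Share of the pool that came from human demonstrations."""
        if not self.episodes:
            return 0.0
        humans = sum(1 for e in self.episodes if e.source == "human")
        return humans / len(self.episodes)
```

As accepted self-play episodes accumulate, `human_fraction()` falls, which is one simple way to picture the reduced reliance on human-labeled samples.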
What are the limitations of SIMA 2?
Despite these advances, the system still has notable limitations. Its memory of past interactions is limited, long-horizon reasoning that requires many steps remains difficult, and precise low-level control, such as robot joint movements, is not addressed by the current framework.
The path to real-world robotics
DeepMind emphasizes that SIMA 2 is not intended to be a gaming assistant. Instead, the company sees three-dimensional game worlds as a useful testing ground for AI agents that could eventually control real-world robots. The broader goal, Google emphasizes, is to develop general-purpose machines that can follow natural language instructions and handle a variety of tasks in complex physical environments.
