15 Amazing Interactive Worlds Google's Genie 3 AI Can Create


Collage of various landscapes and scenes, including volcanoes, jellyfish, mountains, lakes, city streets, floating islands, winged people, sunset roads, boat rides, and more.

Google has announced Genie 3, a new general-purpose AI world model that can generate diverse, photorealistic, interactive environments.

Given a text prompt, Genie 3 can create a dynamic world that users can explore in real time at up to 720p resolution and 24 frames per second. Exploration at 720p24 may not sound impressive at first, but Genie 3 represents a significant advancement as Google's first world model that allows for complex, real-time interaction.

In contrast, Genie 2 tops out at 360p resolution and gives users only minimal movement within the AI-generated world. Users can perform a small set of actions in Genie 2 for about 10 to 20 seconds. Genie 3, however, lets users navigate the world for a few minutes and even interact with objects within it.

A comparison table of four AI models (GameNGen, Genie 2, Veo, and Genie 3) shows differences in resolution, domain, control, interaction horizon, and latency. Genie 3 has a resolution of 720p and excels in multiple categories.

It's also useful to consider Veo 3, Google's latest AI video generator. This generative AI video model represents a significant advance over Veo 2, with output at up to 4K resolution. However, it has significant limitations: clips are under 10 seconds long, and interaction is limited to controlling the video output. It is also worth noting that Genie 1 was released just a year and a half ago. The advances made by Google's DeepMind researchers are surprising, if not a little scary.

Genie 3 offers real-world physics modeling, including water and lighting, and can simulate plant and animal behavior, fully modeled characters, real-world locations, and even historical settings.

“Achieving a high degree of controllability and real-time interactivity in Genie 3 required significant technical breakthroughs. During the auto-regressive generation of each frame, the model has to take into account the previously generated trajectory that grows with time,” explains Google.

“For example, if the user is revisiting a location after a minute, the model has to refer back to the relevant information from a minute ago. To achieve real-time interactivity, this computation must happen multiple times per second in response to new user inputs as they arrive.”
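The loop Google describes can be illustrated with a toy sketch. This is not Genie 3's architecture or API: `Frame`, `toy_next_frame`, and `rollout` are hypothetical names, and a one-dimensional position with a text label stands in for a rendered frame. The sketch only shows the idea that each new frame depends on the whole trajectory so far plus the latest user input, so a revisited location can stay consistent with what was shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for a generated frame."""
    step: int
    position: int
    scenery: str


def toy_next_frame(trajectory: list[Frame], action: str) -> Frame:
    """Toy 'model': builds the next frame from the full trajectory and the latest action.

    A real world model would run a neural network here (assumption: this
    lookup is only an illustration of referring back to earlier frames).
    """
    delta = {"left": -1, "right": 1}.get(action, 0)
    last = trajectory[-1]
    position = last.position + delta

    # Look back through the trajectory for an earlier visit to this location.
    for earlier in trajectory:
        if earlier.position == position:
            scenery = earlier.scenery  # reuse what was shown before
            break
    else:
        scenery = f"scene-{len(trajectory)}"  # first visit: invent new content

    return Frame(step=last.step + 1, position=position, scenery=scenery)


def rollout(actions: list[str]) -> list[Frame]:
    """Auto-regressive loop: every generated frame becomes context for the next one."""
    trajectory = [Frame(step=0, position=0, scenery="scene-0")]
    for action in actions:  # one user input per generated frame
        trajectory.append(toy_next_frame(trajectory, action))
    return trajectory


frames = rollout(["right", "right", "left", "left"])
print([f.scenery for f in frames])
# ['scene-0', 'scene-1', 'scene-2', 'scene-1', 'scene-0']
```

Walking right twice and then back left returns to locations already seen, and the toy model reproduces their earlier scenery instead of inventing something new.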

Google points out that it is also extremely difficult to maintain consistency over long periods in a foundational world model, because seemingly minor inaccuracies quickly snowball. The system has approximately one minute of visual memory: if a user navigates away from an object and then returns, the object should still be in its original location. It is an important achievement, and one without precedent for Google.
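To put those figures in perspective, here is a back-of-the-envelope calculation using only the numbers quoted in this article (24 frames per second, about one minute of memory). The sliding window built with `deque` is purely an assumption made for illustration; Google has not said that Genie 3 stores its visual memory as a fixed-length buffer of frames.

```python
from collections import deque

FPS = 24             # frame rate reported in the article
MEMORY_SECONDS = 60  # "approximately one minute" of visual memory

frame_budget_ms = 1000 / FPS          # time available to generate one frame
window_frames = FPS * MEMORY_SECONDS  # frames spanned by one minute at 24 fps

# Hypothetical sliding window: once full, the oldest frame is dropped.
memory = deque(maxlen=window_frames)
for frame_index in range(2 * window_frames):  # simulate two minutes of frames
    memory.append(frame_index)

print(f"{frame_budget_ms:.1f} ms per frame")          # 41.7 ms per frame
print(f"{window_frames} frames in the window")        # 1440 frames in the window
print(f"oldest remembered frame: {memory[0]}")        # oldest remembered frame: 1440
```

Under these assumptions, each frame must be produced in roughly 42 milliseconds while staying consistent with well over a thousand earlier frames.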

Google acknowledges several limitations, including a restricted action space, difficulty modeling multi-agent interactions in the generated world, text rendering (a common problem with generative AI), and geographically accurate modeling of real locations. Nevertheless, the underlying technology on display here is remarkable.

Genie 3 is currently available to selected academics and researchers, and Google is exploring how to bring it to additional testers in the future.


Image credits: Google. The complete prompts used to generate all 15 examples above are detailed in Google's Genie 3 blog post.


