How Google's Genie 3 Changes AI Videos – and How to Make Your Own Interactive World

ZDNET's key takeaways:

  • World models could help advance AI research, entertainment, and more.
  • Genie 3, Google DeepMind's world model, debuted on Tuesday.
  • Google DeepMind says that Genie 3 has an “understanding” of the world.

Imagine exploring a virtual environment with no boundaries.

This is precisely what many tech developers are working to create through AI “world models”: algorithms that build and act on internal representations of the real world, mimicking the human brain's ability to predict the behavior of physical objects.

A world model like Google DeepMind's new Genie 3 could have a major impact on AI agents, robotics, entertainment, education, and many other areas.

Here is what an AI world model is, how it works, and why it matters.

What is an AI world model?

Just as you can imagine sunlight illuminating the furniture in your living room, or the ripples a stone makes when it drops into a still pond, an AI world model can do more than string words together or produce realistic images. It can make accurate predictions about the real world based on its ability to infer how the world's fundamental physical mechanisms actually work.

This has particularly important implications for the field of AI-generated video. It's one thing to watch millions of videos of glasses falling to the floor and shattering, and to use that as the basis for generating new videos of the same event. It's another to have an intuitive grasp of the physics of gravity, of how far broken glass scatters across carpet versus tile floors, and of the fact that a human hand carelessly touching one of the shards can end up cut and bleeding.

The latter has become the goal for major AI developers: a world model that does not merely mimic scenarios, but can actually predict a virtually infinite number of new ones.

For example, OpenAI's Sora, released last February and an early example of a world model, stunned the AI community with its ability to simulate real-world physics, including light reflecting off a pool of water on a simulated street.

Genie 3

Genie 3 is another example of the power of world models.

From a simple natural language prompt, Genie 3 can generate dynamic simulations of virtual environments that evolve and change according to user actions. (Its predecessors, Genie and Genie 2, debuted in February and December last year, respectively.)

Unlike classic video games with clearly bounded virtual spaces, world models like Genie 3 can extend the simulated environment as users interact with it.

“We're not walking through pre-built simulations,” the narrator said in a demo video introducing Genie 3.

https://www.youtube.com/watch?v=pdkhuknuqdg

Genie 3 comes with what Google DeepMind calls “World Memory,” which allows the model to represent changes that persist over time in the simulated environment. For example, the demo video shows a user painting a wall with a paint roller. The marks made with the roller are still visible after the user turns away and then looks back at the wall.
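To make the idea of persistence concrete, here is a minimal Python sketch. It is not how Genie 3 works, since the mechanism is not described here; `ToyWorld` and its methods are hypothetical names invented for illustration, and the explicit list of wall states is an assumption standing in for whatever the model represents internally.

```python
class ToyWorld:
    """Hypothetical toy world: a ring of wall segments viewed one at a time."""

    def __init__(self, n_segments: int = 4) -> None:
        # Persistent state: it outlives whatever the viewer is currently facing.
        self.walls = ["blank"] * n_segments
        self.facing = 0

    def turn(self, steps: int) -> None:
        """Rotate the viewer by a number of segments (negative turns back)."""
        self.facing = (self.facing + steps) % len(self.walls)

    def paint(self, color: str) -> None:
        """Change the wall segment currently in view."""
        self.walls[self.facing] = color

    def look(self) -> str:
        """Return what the viewer sees on the segment currently in view."""
        return self.walls[self.facing]


world = ToyWorld()
world.paint("blue")   # paint the wall in front of the viewer
world.turn(2)         # turn away from the painted wall
away = world.look()
world.turn(-2)        # turn back toward it
back = world.look()
```

In this toy, `away` is an unpainted segment while `back` still shows the paint, which is the behavior the roller example describes: a change made earlier remains when the user returns to it.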

If you get bored while exploring the simulated environment, you can shake things up by prompting Genie 3 to trigger an event. One example prompt describes a man on horseback carrying a bag full of money and being chased by a Texas Ranger, also on horseback, with the horses' hooves kicking up huge clouds of dust.

“We look forward to seeing how Genie 3 can be used for next-generation gaming and entertainment,” the narrator said in the demo video.

Why are world models so important?

As the narrator of the Genie 3 demo video suggests, world models could have valuable applications in generating more realistic, dynamic, and interactive entertainment.

They could also help the AI industry build embodied agents that can navigate and interact with the real world. (This is a challenge that the autonomous vehicle industry has been trying to overcome since its inception, without success.)

World models could also help first responders prepare for real emergencies by simulating what the Genie 3 demo describes as “dangerous scenarios,” such as the scene of a recent natural disaster. Paired with a virtual reality headset, immersion in a world model could help first responders build muscle memory so that they are equipped to act calmly under pressure.

Education may also benefit from the use of world models, especially for students who are more receptive to visual information.

Do world models actually “understand” the real world?

An algorithm is trained on a large amount of real-world data and gradually improves its ability to make predictions. Ultimately, through a process that researchers are still working to understand, such algorithms can become so proficient at this that, for all intents and purposes, they appear to “understand” some aspects of the world, such as English syntax and the physics of human motion.
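As a rough illustration of "gradually improving at prediction," the Python sketch below fits a single number, the acceleration due to gravity, to a handful of falling-object measurements. Everything in it is an assumption made for illustration: the measurements are synthetic (generated in the code from g = 9.8 m/s²), the names are hypothetical, and real world models learn far richer structure from video rather than one parameter.

```python
# Synthetic "observations" of a dropped object, generated with g = 9.8 m/s^2.
TRUE_G = 9.8
times = [0.5, 1.0, 1.5, 2.0]                    # seconds since release
drops = [0.5 * TRUE_G * t * t for t in times]   # metres fallen at each time


def mean_squared_error(g: float) -> float:
    """Average squared gap between predicted and observed drop distances."""
    return sum((0.5 * g * t * t - d) ** 2 for t, d in zip(times, drops)) / len(times)


def train(steps: int = 200, learning_rate: float = 0.1) -> tuple[float, list[float]]:
    """Start from a bad guess and nudge it to reduce prediction error."""
    g = 0.0
    errors = [mean_squared_error(g)]
    for _ in range(steps):
        # Gradient of the mean squared error with respect to g.
        grad = sum(
            2 * (0.5 * g * t * t - d) * (0.5 * t * t) for t, d in zip(times, drops)
        ) / len(times)
        g -= learning_rate * grad
        errors.append(mean_squared_error(g))
    return g, errors


learned_g, errors = train()
```

After training, `learned_g` sits very close to the value used to generate the data, and the final entry in `errors` is far smaller than the first. Nothing in the loop "knows" physics; it only reduces prediction error, which is the crux of the debate over the word "understand."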

In a blog post, Google DeepMind defined world models as AI systems that can use their understanding of the world to simulate aspects of it, allowing agents to predict how their environment will evolve and how their actions will affect it.
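That definition can be sketched in a few lines of Python. This is a toy under stated assumptions, not Google DeepMind's method: the `State` class, the three action names, and the hand-written transition rule are all hypothetical, whereas a real world model would learn its transition behavior from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical toy state: an agent's position on a one-dimensional track."""
    position: int


ACTIONS = ("left", "right", "stay")


def predict_next(state: State, action: str) -> State:
    """Toy transition rule: predict how one action changes the state."""
    step = {"left": -1, "right": 1, "stay": 0}[action]
    return State(position=state.position + step)


def rollout(state: State, actions: list[str]) -> list[State]:
    """Imagine the outcome of a sequence of actions without acting for real."""
    trajectory = [state]
    for action in actions:
        state = predict_next(state, action)
        trajectory.append(state)
    return trajectory


def best_action(state: State, goal: int) -> str:
    """Pick the action whose predicted outcome lands closest to the goal."""
    return min(ACTIONS, key=lambda a: abs(predict_next(state, a).position - goal))


imagined = rollout(State(0), ["right", "right", "left"])
choice = best_action(State(0), goal=3)
```

Here `rollout` plays the role of predicting how the environment evolves, and `best_action` shows an agent using those predictions to decide what to do before it acts.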

However, the use of the word “understanding” in this context is controversial. Some experts argue that because AI models can only replicate patterns, they can never understand concepts the way humans can, while others take the opposite view, arguing that human understanding may itself be just a sophisticated kind of pattern recognition.

If you tried to walk through every room in your house while wearing a blindfold, you could probably do so without injuring yourself or breaking anything (assuming you've been living there for a while). Similarly, today's AI models can navigate latent spaces of information in a way that looks, at least to us humans, as though they know the lay of the land.
