At Google I/O 2026, we announced our latest models: Gemini Omni and the Gemini 3.5 family.
Gemini Omni is a new model that lets you create anything from any input, including video. Omni allows you to combine images, audio, video, and text as input to produce high-quality videos based on Gemini’s real-world knowledge. You can also easily edit videos through conversations.
Gemini 3.5 is our latest model family, combining frontier intelligence with action. It represents a major advance in building more capable and intelligent agents. We are starting the series with the release of 3.5 Flash, which provides state-of-the-art performance for agents and coding and excels at complex, long-horizon tasks with real-world utility.
To help you get a clearer picture of Gemini Omni and Gemini 3.5 Flash, here are nine demos that showcase what they can do.
Gemini Omni
Edit videos through conversations. One of the features that makes Omni special is the ability to easily edit videos using natural language. Every instruction builds on the last one. Characters are consistent, physics is maintained, and scenes remember what came before. In other words, you can change the world around you. Change certain things or change everything. Your video will be the starting point for something you could never have filmed on your own.
