Google’s new Gemini Omni: Edit AI video just by chatting

Google has released a video AI that lets you edit clips just by chatting with it. No timeline, editing software, or experience required. Announced at Google I/O 2026 on May 19th, the new Gemini Omni model marks Google’s most aggressive move yet into the AI video creation race, pitting it against OpenAI’s Sora and Adobe Firefly. Here’s what it actually does and whether it’s as impressive as it sounds.

What is the Google Gemini Omni AI video editor?

Most AI video tools work in one direction: you write a prompt and the AI spits out a clip. Gemini Omni changes that dynamic. It accepts any combination of inputs (text, images, audio, and existing video) and uses them to build or modify scenes through back-and-forth conversation. Each instruction you give builds on the last, keeping characters, lighting, and objects consistent between edits.

Think of it as a creative co-pilot rather than a video generator. Shoot a 15-second clip and tell Omni to change the background, swap out the characters’ outfits, and make the whole thing look like it was filmed in the 1970s. According to Google’s official announcement, your video becomes the starting point for something you could never have filmed on your own.

The model is also significantly better at understanding real-world physics, such as gravity, motion, and the behavior of liquids. This means that AI-generated scenes should look much less “floaty” than those produced by earlier tools.

What you can do now with Gemini Omni Flash

The version released today is Gemini Omni Flash, the first public model built on the Omni architecture. It supports the following at launch:

Text-to-video generation: Describe a scene and get a short clip
Image animation: Give it a still photo and Omni will bring it to life
Conversational editing: Change actions, add characters, and adjust the style through chat
Explainer creation: Turn short prompts into visual breakdowns of complex ideas

There is one limitation worth noting for now: audio output is limited to voice. Omni cannot yet generate custom music or sound effects, only spoken narration.

All videos produced by Omni include a SynthID digital watermark, Google’s way of making AI-generated content identifiable. This is an especially important safety layer given that deepfake-related concerns have grown louder across the industry in 2025 and 2026.

AI Avatars and Voice Clones: A Game Changer for Creators

One of the most impressive features is the personal avatar tool. You can create a digital version of yourself (a clone that looks and sounds like you, built with your own voice) and generate video content starring that avatar. Google says it has “clear policies to protect against” abuse, but the company has not yet released full details of those safeguards.

This is a very useful tool for content creators and educators, but it is also a serious challenge for platform moderators. Expect this particular feature to attract increased scrutiny soon.

Gemini Omni vs. OpenAI Sora and Adobe: The AI race of 2026

Gemini Omni didn’t arrive alone. It is part of Google’s largest single-day AI announcement to date. At the same I/O keynote, which our team covered across multiple areas, Google also announced Gemini 3.5 Flash, a new inference model optimized for speed that is already rolling out across Search and Workspace.

Meanwhile, the broader Gemini ecosystem is also expanding into hardware. Google’s first AI smart glasses push the same conversational AI experience into a wearable format, and Gemini on Wear OS 7 puts the model directly on your wrist.

The pattern is hard to miss. Google isn’t positioning Gemini Omni as a standalone creative toy. This is a core piece of infrastructure intended to run on every surface the company owns. Conversational video editing is just the beginning.

Google is also now in direct competition with OpenAI’s Sora and with Adobe, which builds Firefly into its professional tools. The difference is that Google’s approach routes everything through the Gemini app, Search, and the YouTube ecosystem (platforms with billions of existing users), rather than asking people to adopt a new standalone product.

Gemini Omni Pricing and Availability (Updated May 2026)

The current situation is as follows.

Platform	Access	Cost
YouTube Shorts	Available now	Free
Gemini app	Available now	AI Plus, Pro, or Ultra plan
Google Flow	Available now	AI Plus, Pro, or Ultra plan
Developer (API)	Coming soon	To be determined

Google has not yet confirmed pricing for developer API access. With Gemini Spark, Google’s lightweight companion model, also rolling out around the same time, the picture of who will pay what for which features is still taking shape.

If you’re already a Gemini app subscriber, Omni Flash is available starting today. If you’re a YouTube Shorts creator, you can start experimenting without spending a penny.

Conclusion

Gemini Omni is a meaningful step forward, not because AI video is new, but because conversational editing is completely different from anything that was available before. The ability to keep iterating on a scene through natural language while the model remembers what is already there removes one of the biggest points of friction in AI content creation: having to start over every time something changes.

Whether Google can maintain its lead as OpenAI and Adobe continue to ship is another question. But as of today, Omni Flash is real, it’s live, and it’s worth trying.

FAQ

Is Gemini Omni free?

Partially. YouTube Shorts users get free access starting this week. Full access through the Gemini app and Google Flow requires a Google AI Plus, Pro, or Ultra subscription. Access to the developer API will begin in the coming weeks, but pricing has not yet been confirmed.

How is Gemini Omni different from Sora?

While Sora excels at generating entirely new video clips from a single text prompt, Gemini Omni is designed as a conversational editor. You can feed it existing videos and images and change them step by step (for example, altering a shirt color or the background) through a continuous series of chat commands.

How is Gemini Omni different from Veo?

Veo (Google’s previous video tool) was primarily a text-to-video generator. Gemini Omni goes further. It can accept multiple input types at the same time, refine results through back-and-forth conversation, and keep elements like text and lighting consistent between edits. The head of product at Google DeepMind described this as a fundamentally different architecture rather than an incremental update.