Google’s new Gemini Omni AI video model can do crazy things

Google’s new Gemini Omni artificial intelligence (AI) model can do some amazing things. Its key promise is to create anything from anything.

Google says the new Gemini Omni model can “create anything from any input,” including audio, video, photos, and text. The model starts with video generation, which users can then edit via text conversations with Gemini. This first model, Gemini Omni Flash, is currently available in the Gemini app, Google Flow, and YouTube Shorts.

As Google explains, it’s easy to edit AI-generated videos using text. The model also promises consistency across edits, including for characters, so that Omni remembers what was shown in the previous scene.

Prompt: Create a sculpture from foam.

The company even promises that Gemini Omni can leverage its “intuitive understanding of physics” to effectively “bridge the gap from photorealism to meaningful storytelling.”

Prompt: Marbles rolling at high speed on a chain reaction style track, continuous smooth shots.

Users are already achieving great results with Gemini Omni. For example, former Google product manager Bilawal Sidhu gave Gemini Omni a photo of a drone trajectory and had the AI generate a POV of the drone.

We gave Google Omni a sketched camera path and asked it to generate a POV of the drone. pic.twitter.com/cQZFMtOkEi

— Bilawalsidhu (@bilawalsidhu) May 26, 2026

The Verge’s Allison Johnson called Omni “wild” and let the AI bring her child’s stuffed animal “Buddy” to life. Buddy went on exciting AI adventures, including whitewater rafting and snowboarding.

“The results were very mixed and perplexing. Some were very good. Much more consistent and true to my instructions than when I was testing Veo five months ago,” Johnson wrote. “But even in the best clips Omni created for me, certain AI jump scares still remain, like when Buddy suddenly turns around while skydiving.”

Prompt: Convert this to realistic footage. Use drawings only as movement guides and do not display drawings in the final video.

Omni’s biggest claim to fame, as tested by Johnson, is its ability to combine AI-generated video with a variety of input media, with results that range from technically brilliant to potentially dangerous. By her account, one of her deepfakes even convinced her husband that it was basically the person he has seen in real life every day for the past 10 years.

Whether this is a wonderful thing or a horrible thing depends on who you ask.

“I’m sure I’m not the only one who thinks this has no reason to exist,” wrote near_photography in the thread in response to Johnson’s post above. “There is no net benefit to society from this feature.”

Prompt: Apply poses and motions from the input video to the specified character from this image. Apply the image reference’s style to the new video.

As Google points out, all videos generated using Omni include an “imperceptible SynthID watermark,” which lets users check in Gemini, Gemini in Chrome, and Google Search whether something was created with Google AI. But what if someone isn’t using those platforms?

For example, Google is bringing this technology directly to YouTube Shorts and YouTube Create, but it’s impossible to predict how users will use it there.

Image credits: Google
