Real life or AI? Try to tell which of these videos is real

Ask Google Gemini to generate videos using Veo 3

Mishaal Rahman / Android Authority

Having spent time with generative AI, I thought I had a fair idea of what to expect from Veo 3, Google's cutting-edge AI video generator. But when I finally forked over $20 for a Google AI Pro subscription a few weeks ago, I was surprised to find that it was even better than my most optimistic expectations. Unlike early AI image generators that produced obvious artifacts such as extra fingers and absurd architecture, Google's Veo 3 can generate videos that look surprisingly similar to their real-world equivalents.

In fact, some Veo videos I've seen on social media were so convincing that I had to double-check whether I was looking at AI-generated content or stock clips. Naturally, that led to the question: how good is Veo 3, and can the average person even tell that they are watching a video generated by AI? To investigate, I've put together the short quiz below, pitting six Veo-generated clips against real video. Can you tell the difference?

AI video generation using Veo 3: Scary

Gemini Veo 2 for Android phones

Mishaal Rahman / Android Authority

Veo 3's ability to generate highly convincing clips is impressive in its own right, but it goes a step further: it can also generate synchronized audio and sound effects. This means that the results it produces can appear indistinguishable from actual footage to the untrained eye.

Of course, if you look closely, there are signs pointing to the AI origins of synthetic videos, but you can expect those small flaws to disappear sooner rather than later. Since its debut at I/O, Google has shipped a number of fixes for Veo 3, including a recent one that prevents text from appearing as glitchy subtitles.

You will need a Google AI Pro or Ultra subscription to generate videos using Veo 3. That will set you back over $20 a month, to say nothing of the eye-watering higher tier at $250 a month. Even then, you only get a limited number of generation credits per month.

Google Veo 3 is expensive and very limited, but still very capable.

The list of Veo 3 restrictions doesn't end there. At this point, you can only generate very short videos, each no longer than 8 seconds. That said, Google Flow, an experimental AI filmmaking tool, lets you chain multiple Veo-generated clips together to create longer videos. Apart from length, another major limitation is that you can only generate 720p videos with Veo 3.

Veo 3 likely costs Google a lot of money in terms of processing. I don't know Google's exact internal costs, but I do know what developers are charged for using Veo 3 via the API. Video with audio costs $0.75 per second to generate, while a silent clip costs $0.50 per second. This means that an 8-second video costs developers up to $6 per generation. Multiply that by just a few clips and it becomes clear why Google limits how many generations you get with a $20 Pro subscription. The cost of this technology is probably not trivial.
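To make the arithmetic concrete, here is a minimal Python sketch. It assumes both quoted rates are per second (the $6 figure for an 8-second clip implies this) and uses hypothetical helper names, `clip_cost` and `clips_within_budget`, that are not part of any Google API. The rates are the ones quoted in this article and may not reflect current pricing.

```python
# Per-second API rates quoted in the article (USD). Assumption: both are
# per-second rates; treat them as a snapshot, not current pricing.
RATE_WITH_AUDIO = 0.75
RATE_SILENT = 0.50
MAX_CLIP_SECONDS = 8  # clip length cap mentioned in the article


def clip_cost(seconds: float, with_audio: bool = True) -> float:
    """Return the API cost in USD of one generation of the given length."""
    rate = RATE_WITH_AUDIO if with_audio else RATE_SILENT
    return seconds * rate


def clips_within_budget(budget: float, seconds: float, with_audio: bool = True) -> int:
    """Return how many whole clips a budget would cover at the API rates."""
    return int(budget // clip_cost(seconds, with_audio))


if __name__ == "__main__":
    print(clip_cost(MAX_CLIP_SECONDS))                # 6.0 (with audio)
    print(clip_cost(MAX_CLIP_SECONDS, False))         # 4.0 (silent)
    print(clips_within_budget(20, MAX_CLIP_SECONDS))  # 3
```

At these rates, $20 would pay for only three 8-second clips with audio through the API, which gives a rough sense of why subscription credits are capped. Subscription credits and API billing are separate systems, so this is an illustration rather than an exact conversion.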

So is Veo 3 worth that princely price tag? That brings us back to the original question: can we actually tell the difference between real-world videos and AI-generated ones? Below, I've arranged six short clips. Let's see which ones you can spot.

Video 1: Combine harvesters

Let's start with something simple. This one is relatively easy to pick if you look closely. The AI-generated version doesn't replicate many of the real-world details you'd expect from an actual farming scene. The sky, farm machinery, and small background elements look a little too clean and even. But to be fair, I gave Veo 3 a rather short and descriptive prompt.

With that in mind, Veo 3 actually did an amazing job. If you hadn't seen the video side by side with real footage, it could easily pass for the real thing at a glance. What's even more impressive is that Veo 3 delivered on both counts when I asked for a particular color scheme for the machine and also mentioned the brand name. This shows how good it is at following context and direction.

Video 2: Squirrel eating nuts

Another relatively simple one. The Veo 3 version comes impressively close, with subtle body movements and surprisingly convincing ambient sound, but it falls short when placed next to actual stock footage. The AI squirrel is a little too clean and the background is too dark, though my prompt is responsible for that. The most impressive part? I instructed Veo 3 to focus on the squirrel's fur with a shallow depth of field, and it did exactly that.

I think what gives it away is the lack of the unpredictable realism you get from a real animal. In the stock clip, the squirrel fumbles with the nut, biting off more than it can (literally) chew, and there's a bit more character. Still, if you watched the AI clip on its own, you probably wouldn't suspect a thing.

Video 3: A busy night market in Thailand

Veo 3 shows off its strengths here, capturing the overall atmosphere – a sense of lively energy and movement. If you've never been to Thailand, both videos may seem equally convincing.

However, if you look closely, cracks begin to appear. The food stalls are too uniform, and there is none of the visual clutter seen in real night markets. Vendors also seem to sell random, incongruent items that don't make much sense. And when you look at the vendors' hand movements, you can see that they are rather unnatural. This is a classic telltale sign of generative AI, and Google's video generator is not immune to the problem.

Still, this is a difficult scene to pull off, and given the complexity, Veo 3's attempt is half decent.

Video 4: Hiker and rolling fog

This scene is perhaps the most memorable of the bunch. Without the clutter of urban elements and complex character interactions, Veo 3 can really shine. It handles dramatic lighting, scenic landscapes, and atmospheric effects like fog without breaking a sweat. It also helps that the real-world clip looks impressive enough to pass for something out of a video game.

That makes this one really difficult to guess. Need a hint? If you look closely at the hiker's left hand, you will notice subtle rendering hiccups that break the illusion.

Video 5: Herd of goats

Another difficult one. Veo 3 produces impressive results here, and at first glance it's really hard to tell the AI-generated video apart from the real thing. The goats' pacing and movement are quite convincing.

I'm not sure I could tell them apart on my own, but once you know which one is AI-generated, you can pick out subtle oddities. For example, the ground in the AI clip is a little too flat. The goats' faces and bodies are also strangely smooth, whereas real animals have dirt on them. Still, there is no obvious flaw. It's more of a gut feeling.

How accurately can you spot AI-generated videos?

Some of the above clips were easier to spot than others, but if you found yourself second-guessing even the obvious ones, you're not alone. When an AI-generated video gets the lighting, camera angles, and subject matter almost right, telling it apart can be surprisingly difficult. Even if you've looked at hundreds or thousands of AI-generated images, you may not catch many fakes without a direct comparison.

As the technology gets cheaper, we can expect videos made using Veo 3 to become more common. Google currently adds a small watermark in the lower right corner of every AI-generated video, but if you didn't notice it above, that's because I cropped it out of all the clips. It only took me a few minutes per video to do that. This means that we need to find new and more effective ways to deal with the imminent deluge of fake videos on the internet. I don't know what the solution is, but I hope that Google's AI ethics team figures it out.
