Google leverages YouTube's massive video library, one of its most valuable data assets, to train its most advanced AI models, including Gemini and the new Veo 3 video generator, according to CNBC reporting.
YouTube offers Google an unparalleled repository of real, human-generated content, with over 500 hours of video uploaded every minute. However, this use of the platform's own content has drawn both praise for its technical ingenuity and scrutiny over privacy and data ethics.
The company has officially confirmed that it is using this video trove to enhance the performance of its Gemini AI language model and Veo 3, the cutting-edge AI that generates realistic videos with synced audio from a text prompt.
At the Cannes Lions advertising festival, YouTube CEO Neal Mohan revealed that Veo 3 will soon be integrated into YouTube Shorts, embedding AI further into the heart of user-generated content. Developed by Google DeepMind, Veo 3 is trained on video and audio datasets to create dynamic eight-second clips with dialogue, sound effects, and visuals.
Google has long been guarded about how its AI models are trained, but it openly acknowledges that YouTube videos form an important part of its AI training pipeline. This means that public content uploaded by millions of users over the years has likely helped its AI learn to recognize human gestures, environmental cues, spoken language, and video pacing.
This wealth of data gives Google a significant edge in developing more nuanced and realistic AI models, but it also raises serious concerns about consent, copyright, and how user data is reused.
YouTube's terms of service allow the platform to use uploaded content for product improvements, but users may not have expected their videos to be used to train AI that can generate synthetic voices, faces, and even events.
While Google promotes Veo 3 as a tool for creativity, marketing, and storytelling, others are concerned about AI misuse and misinformation. Reports from outlets like Time show how Veo 3 can create realistic but fictional videos, such as protest scenes, fake political speeches, or doctored news segments.
Even with digital watermarking tools like SynthID in place, critics argue that such safeguards are still in development and may not be sufficient to prevent abuse.