New Google DeepMind AI tech generates soundtracks for videos

Google's artificial intelligence lab, DeepMind, is working on new technology that can generate soundtracks and even dialogue to accompany videos. The lab shared progress on its video-to-audio (V2A) project, which can be combined with video creation tools like Google Veo and OpenAI's Sora. In a blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects that sync up with what's happening on the screen. The tool is not limited to AI-generated video: it can also create soundtracks for traditional footage, such as silent movies and other videos without audio.

DeepMind researchers trained the technology on videos, audio, and annotations that included detailed AI-generated sound descriptions and transcripts of conversations, so the technology learned to associate specific sounds with visual scenes. The DeepMind team isn't the first to release an AI tool that can generate sound effects (ElevenLabs also recently did so), and they say they won't be the last. “Our work goes beyond existing video-to-audio solutions because it can understand raw pixels and adding text prompts is optional,” the team wrote.

The text prompt is optional, but you can use it to shape and refine the final product to make it as accurate and realistic as possible. For example, you can enter positive prompts to guide the output toward the sounds you want, and negative prompts to steer it away from sounds you don't want. In one sample, the team used the prompts “movie, thriller, horror movie, music, tension, atmosphere, footsteps on concrete.”

The researchers acknowledge that they are still trying to address existing limitations of V2A technology, such as the degradation of the output audio quality when the source video is distorted. They are also working on improving the lip sync of the generated dialogue. Moreover, they vow to conduct “rigorous safety evaluations and testing” before releasing the technology to the world.



