Microsoft introduces VASA-1, an Image-to-Video AI model that produces eerily realistic results

Microsoft has introduced a new artificial intelligence (AI) model that can generate hyper-realistic videos of talking human faces. The image-to-video model, called VASA-1, can generate a video from just a single photo and a speech audio clip. According to the company, the generated videos have lip movements synchronized with the audio, along with natural-looking facial expressions and head movements. Notably, the tech giant says it does not intend to release any product or API based on VASA-1, and that the model will instead be used to create realistic virtual characters.

In a post on its research announcement page, Microsoft detailed the AI model it is developing and highlighted its capabilities. The company claims that VASA-1 can produce videos at 512 x 512 resolution at up to 40 FPS. The model is also said to support online video generation with negligible start-up delay. X (formerly Twitter) user Kaioken shared a video of the AI model in action.

VASA-1's biggest accomplishment is its ability to render up to a minute of high-quality video (according to the demos) from a single still image. The company also says the model can generate lip movements that match the audio file, and it emphasized the model's ability to produce matching facial expressions. The model additionally offers fine-grained control over various aspects of the video, such as main gaze direction, head distance, and emotion offset. These controls over disentangled appearance, 3D head pose, and facial dynamics allow the output to be adjusted precisely according to the user's instructions.

Additionally, the AI model was able to generate videos from artistic photos, singing audio, and non-English speech. Microsoft researchers note that these capabilities were not present in the training data, suggesting a degree of self-learning ability.

While it is impressive that an AI model can generate hyper-realistic videos of real people from arbitrary audio, it also raises concerns about unethical use, especially the creation of deepfakes. The company stressed that it does not intend to release the AI model to the public, but rather to use it to create interactive virtual characters.

Microsoft also said the technology can be used to improve forgery detection. While acknowledging the potential for abuse, the company said it is essential to recognize the technology's significant positive potential. It cited benefits such as providing companionship and therapeutic support to those in need, which it said underscore the importance of this research and related work. The company added that it is dedicated to developing AI responsibly, with the goal of advancing human well-being.
