Today, London-based Synthesia, a startup that enables businesses to create professional-quality AI videos, announced a major update to its platform aimed at providing a comprehensive suite for accelerating video-first communications initiatives.
Officially named Synthesia 2.0, the update introduces several key features, including full-body avatars with a wide range of movements, interactive video experiences that allow users to create AI videos with elements they can interact with, like calendars and forms, and a new AI screen recorder that makes it easier for businesses to create how-to videos and other content for their employees.
This development follows on the heels of Synthesia's announcement of expressive avatars, though it's important to note that not all features will debut immediately: some will be released next month, while others will roll out over the coming months.
The next step in strengthening corporate communications
In 2017, a team of AI researchers and entrepreneurs from Stanford University, Technical University of Munich, and the University of Cambridge came together to found Synthesia. Their goal was simple: to help businesses quickly move from monotonous text-based content to more engaging and immersive video content. Over the years, they developed an end-to-end platform that allows businesses to create custom AI voices and avatars (they can also choose from existing ones) and combine them with pre-written or AI-generated scripts to generate AI videos.
Fast forward to today, and Synthesia is used by over 55,000 companies, including Zoom, DuPont, Heineken, and Electrolux. The company has also significantly enhanced its AI avatars, making them more lifelike and emotive. Just a few weeks ago, it unveiled its new Express-1 model, which allows avatars to understand the context and sentiment of text and adjust their tone and facial expressions when delivering speech.
With its latest update, the company continues to develop its avatars. To strengthen the storytelling ability of its digital characters, it is expanding the avatars' range of motion. This gives the avatars more personality and allows them to tell compelling stories using the full range of body language available to humans, including the hands.
The improved avatars are the result of collecting data from thousands of people around the world and training several large-scale video- and audio-based models, said Dan-Vlad Cobasneanu, Synthesia's head of product marketing. He added that these avatars are fully controllable, allowing users to specify the avatar's appearance with images and videos, and animate it with skeletal sequences.
But that's only part of the avatar upgrade.
Synthesia is also enhancing how users create their own AI avatars by allowing them to use a webcam or mobile camera with a natural background. This is especially useful when users want to appear more authentic, such as when doing tutorials, according to CEO Victor Riparbelli. The recorded personal avatars will have improved lip sync, will sound more natural, and will be able to translate speech into more than 30 languages.

Interactive AI Video
While the upgraded avatars improve the way content is delivered, the new interactive video player built by Synthesia changes the way content is consumed. Users can integrate a variety of clickable hotspots into their content so end viewers can click to take action. For example, they can click on elements to fill out a form, open a calendar or quiz, or jump to just the part of the video they want to see.
While this feature appears to still be a few months away, the demo video showed that users can enable these clickable experiences simply by turning on interactivity and defining where each hotspot should take the viewer. The first feature to debut in the Interactive Experiences suite will be the ability to switch the video, and what is displayed, to the viewer's preferred language, the company said.
Synthesia is also adding an AI screen recorder. Initially, the feature will work like a regular screen recorder, capturing whatever is happening on the screen. Once you stop recording, the company's underlying model will generate a professional-quality AI video that includes the speaker's audio and a transcript of the audio. Users can then edit this to add their own avatars and auto-zoom effects, and highlight important actions. They can also edit the script to update the content as needed.

What else comes with Synthesia 2.0?
Synthesia 2.0 also adds incremental improvements, such as Brand Kit (which incorporates brand language and identity into videos) and the ability to bulk generate content via the company's AI-powered video assistant.
It also offers new collaboration capabilities that enable multiple users to work on video projects at the same time, and an improved one-click translation experience where users only create and maintain one version of a video, with translations generated and updated automatically.
It will be interesting to see how these new features drive adoption for Synthesia, which has focused on enterprise applications with an approach built on consent, moderation and collaboration. Other players competing with the company in this space include Deepbrain AI, Rephrase and HeyGen.
