
Google has announced a new AI model called Veo, which generates videos tailored to users' creative visions. Google is also upgrading its image generation model, now in its third generation as Imagen 3.
Bard was one of the first ways to try Google's latest AI LLMs. It was first released about a year ago and has seen significant changes in recent months. One of the biggest was a complete name change, rebranding the user-facing AI tool as Gemini. The Gemini name has since expanded to the Gemini Nano and Gemini Pro models found on current and upcoming devices.
Just before Bard was renamed Gemini, Google added the ability to request images through the conversational AI model. If you ask for an image of a cow on a boat, it will render that image in whatever style it sees fit. This feature used Imagen 2, the first version released to the public.
Google's Veo model
Today, Google is announcing two creative generation models: Veo and Imagen 3. Veo is the more exciting of the two because the general public hasn't been able to try it yet. Like other modern models, it is purpose-built for video generation and understands both visual semantics and natural language. Applied to video generation, this approach produces results that can be creatively tailored to specific styles.
Google says the Veo model can understand “cinematic terminology” in user prompts, such as aerial shots and time-lapses. Veo can generate 1080p videos longer than a minute, which puts it ahead of current models such as OpenAI's Sora, whose clips can run up to 60 seconds.
Veo builds on years of work in generative video models, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere, and combines new architectures, scaling laws, and other techniques to improve quality and output resolution.
Google is inviting creators and filmmakers to try Veo, with the aim of shaping a model that can accommodate a variety of artistic styles and use cases.
Imagen 3
The Imagen model has also been significantly updated. Imagen 3 is positioned as Google's “highest quality” text-to-image model, and includes several improvements over the Imagen 2 model seen in Gemini and Bard.
Imagen 3 is said to bring a higher level of detail to generated images, with fewer visual artifacts and imperfections. When prompted, it can also produce more photorealistic images.
Perhaps the biggest improvement is Imagen 3's ability to render text. Text rendering has been a comical weakness of text-to-image models like DALL-E and Adobe Firefly. Google is positioning the new model as a way to create personalized images with text, such as greeting cards or photos with messages. It remains to be seen how well text actually renders, but this is a promising improvement.
Both Veo and Imagen 3 will be available in private preview through Google Labs' VideoFX. VideoFX leverages SynthID to ensure that the content created is digitally watermarked and produced in a responsible manner.
Those interested in trying out the new models can sign up through Google's waitlist.
