As AI-generated content gains traction, startups developing the underlying technology are raising the bar with their products. Just a few weeks ago, RunwayML opened up access to a new, more realistic video generation model. Now, Haiper, a London-based AI video startup founded by former Google DeepMind researchers Yishu Miao and Ziyu Wang, is announcing its new visual foundation model, Haiper 1.5.
Available on the company's web and mobile platforms, Haiper 1.5 is an incremental update that lets users generate eight-second clips from text, images and video prompts — twice the length of Haiper's earlier model.
The company also announced a new upscaler feature that lets users enhance the quality of their content, and said it plans to move into image generation.
The move comes just four months after Haiper came out of stealth. Though the company is still in its early stages and not as well-funded as other AI startups, it claims to have over 1.5 million registered users on its platform, which it points to as evidence of early traction. The company now aims to grow this user base with an expanded suite of AI products to take on Runway and other companies in the space.
“The video generation AI race isn't necessarily just about the power of the models, but also about replicating these models at scale. Our distributed data processing and scale model training allow us to continue training and iterating on powerful underlying models with this goal in mind. As highlighted by this update, we are making continued progress not only in producing more beautiful, longer videos, but also in building models that can reproduce imagery that we all truly recognize as the world around us,” Miao, who is also the company's CEO, told VentureBeat.
What does the Haiper AI Video Platform bring to the table?
Launched in March, Haiper follows in the footsteps of Runway, Pika and others in providing users with a comprehensive video generation platform powered by an in-house trained perception-based model. It's designed to be easy to use: users enter text prompts describing what they imagine, and the model generates content from them. There are also controls to adjust elements like characters, objects, backgrounds and artistic style.
Initially, Haiper animated text prompts or existing images into 2-4 second clips. While that feature served its purpose, the content length wasn't long enough to cover a wider range of use cases, a concern the company often heard from creators. With the launch of its latest model, the company is addressing this by doubling the generation length to 8 seconds.
It also allows users to extend their previous 2- and 4-second generations to 8 seconds, similar to what we've seen in other AI video tools, such as Luma's new Dream Machine model.
“Launched less than four months ago, we've been thrilled with the response to our video generation model. Our goal to continue pushing the boundaries of this technology has led to our latest eight-second model, doubling the length of video generation on the platform,” Miao said in a statement.
But that's not all.
Originally, Haiper produced just 2 seconds of high-definition video, with any longer clips produced in standard definition. The latest update has changed that, as it can now generate clips of any length in SD or HD quality.
There's also an integrated upscaler, allowing users to enhance all video generations to 1080p with one click, without disrupting their existing workflow. The tool also works with images and videos users already have; simply upload them to the upscaler to improve the quality.
In addition to the upscaler, Haiper is also adding a new image model to its platform, which allows users to generate images from text prompts and animate them using text-to-video functionality for flawless video results. According to Haiper, integrating image generation into its video generation pipeline will enable users to test, review, and revise content before moving on to the animation stage.
“At Haiper, we want to listen to our users and bring their ideas to life, rather than just iterating. The debut of our new upscaler and Text2Image tools is a testament to the fact that we are a video generation AI platform for the community, engaging with our users and actively improving,” Miao added.
Building AGI that perceives the world
Haiper's new models and updates look promising, especially in the samples the company shared, but they have yet to be tested by the wider community: When VentureBeat tried to access the tool on the company's website, the image models were unavailable, and the eight-second generation and upscaler were restricted to only users paying for the company's Pro plan ($24 per month, billed annually).
Miao said the company plans to make the eight-second videos more widely available in several ways, including through a credit system. The image model will debut for free later this month, with the option to upgrade for faster speeds and more concurrent generations.
In terms of quality, the platform's 2-second videos seem more consistent than the longer videos, which are still hit and miss. The 4-second videos we generated were sometimes blurry, lacking (or overdoing) detail in subjects and objects, especially for content with a lot of movement.
However, these updates and others planned for the future are expected to improve the quality of Haiper's generations. The company plans to enhance the perception-based model's understanding of the world, essentially creating an AGI that can capture the smallest visual details such as light, movement, texture, and interactions between objects, and reproduce the emotional and physical elements of reality to create content that is true to life.
“Each frame of video is packed with minute visual information. For AI to create realistic and visually stunning content, it needs to intrinsically understand the world and the physical phenomena behind it. An AI that can understand, interpret, and generate the complexities of video content will have deeper knowledge and perceptual capabilities, bringing it one step closer to AGI. Models with such capabilities could have far-reaching applications beyond content creation and storytelling, in areas such as robotics and transportation,” Miao explained.
It will be interesting to see how the company develops in this direction and how it stacks up against rivals like Runway, Pika, and OpenAI, which remain ahead in the AI video race.