Alibaba releases WAN2.2 AI video generation model, combining Mixture-of-Experts with video diffusion



Chinese technology giant Alibaba has released WAN2.2, a major open-source update to its AI video generation model. The new series, unveiled on July 28th, directly challenges rivals such as OpenAI's Sora and Google's Veo. It introduces an advanced Mixture-of-Experts (MoE) architecture to improve video quality.

The release also includes a highly efficient 5B model that generates 720p video on a consumer-grade GPU. The move is part of Alibaba's strategy to lead the open-source AI space by providing powerful, free tools to developers and researchers. WAN2.2 succeeds the company's WAN2.1 model, released earlier this year.

Under the hood: MoE Architecture and Consumer-Grade HD Video

The core innovation of WAN2.2 is the introduction of a Mixture-of-Experts (MoE) architecture into a video diffusion model, a first in this field. This design, widely validated in large language models, allows a significant increase in total model capacity without a corresponding increase in computational cost during inference. The architecture is specifically tailored to the video generation process, separating the complex denoising task into specialized functions.

The MoE system uses a two-expert design. A “high-noise” expert handles the early stages of generation and focuses on establishing the overall layout and motion of the video. As the process continues, a “low-noise” expert takes over, refining intricate details and improving visual quality.
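The hand-off between the two experts can be sketched as a simple routing rule over the noise level. This is a minimal illustration, not the project's implementation: the `Expert` class, the `select_expert` and `schedule` helpers, the linear noise schedule, and the `BOUNDARY` value of 0.5 are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    name: str
    role: str


# Two experts, as described in the article. The role strings are descriptive only.
HIGH_NOISE = Expert("high_noise", "overall layout and motion")
LOW_NOISE = Expert("low_noise", "fine detail and visual quality")

# Hypothetical switch point: noise_level runs from 1.0 (pure noise) to 0.0 (clean).
# The real model's switching criterion and threshold are not given in the article.
BOUNDARY = 0.5


def select_expert(noise_level: float, boundary: float = BOUNDARY) -> Expert:
    """Pick the single expert that runs at this denoising step."""
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    return HIGH_NOISE if noise_level >= boundary else LOW_NOISE


def schedule(num_steps: int) -> list[tuple[float, str]]:
    """Return (noise_level, expert name) pairs from noisiest to cleanest step."""
    levels = [1.0 - i / num_steps for i in range(num_steps)]
    return [(level, select_expert(level).name) for level in levels]


if __name__ == "__main__":
    for level, name in schedule(8):
        print(f"noise={level:.3f} -> {name}")
```

Because exactly one expert runs per step, the per-step compute is that of a single expert even though two sets of weights exist.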

According to the project's technical documentation, this approach raises the model's total parameter count to 27 billion, but only 14 billion are active at any given step, keeping the computational footprint in line with that of a much smaller model.
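A quick back-of-the-envelope calculation shows what this trade-off means in practice. The sketch below assumes roughly 14 billion parameters are active per step and that weights are stored in a 16-bit format (2 bytes per parameter); both figures, and the helper names, are assumptions for illustration.

```python
TOTAL_PARAMS = 27e9    # total parameters across both experts (from the article)
ACTIVE_PARAMS = 14e9   # assumption: parameters used at any single step
BYTES_PER_PARAM = 2    # assumption: 16-bit weights (e.g. fp16 or bf16)


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that participate in one denoising step."""
    return active / total


def weight_gib(params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate size of the raw weights in GiB, ignoring activations and caches."""
    return params * bytes_per_param / 2**30


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active per step: {frac:.1%} of all parameters")
    print(f"all weights:    ~{weight_gib(TOTAL_PARAMS):.1f} GiB")
    print(f"active weights: ~{weight_gib(ACTIVE_PARAMS):.1f} GiB")
```

Note that per-step compute tracks the active parameters, while the memory needed to keep both experts resident still scales with the total unless the inactive expert is offloaded.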

To complement the new architecture, WAN2.2 was trained on a much larger and more refined dataset, featuring 65.6% more images and 83.2% more videos than its predecessor, WAN2.1. The team focused on creating “cinematic-level aesthetics” by using meticulously curated data with detailed labels for lighting, composition, contrast and color tone.

This enables more accurate and controllable generation, letting users create videos with customizable aesthetic preferences, as detailed in the official announcement.

WAN2.2 was compared against leading closed-source commercial models on Alibaba's proprietary Wan-Bench 2.0.

Perhaps the most important part of the release for accessibility is the new TI2V-5B model, a compact 5-billion-parameter version designed for efficient deployment. This hybrid model natively supports both text-to-video and image-to-video tasks within a single unified framework. Its efficiency comes from a new high-compression VAE (variational autoencoder) that achieves significant compression ratios, making high-resolution video generation possible on consumer-grade hardware.
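To see why a high-compression VAE matters, it helps to count how many positions the diffusion model has to process before and after encoding. The article does not state the compression factors, so the sketch below assumes a hypothetical 4x reduction in time and 16x in each spatial dimension, applied to a 5-second 1280x720 clip at 24fps; the function names are likewise illustrative, and details such as special handling of the first frame are ignored.

```python
import math


def latent_shape(frames: int, height: int, width: int,
                 t_factor: int = 4, s_factor: int = 16) -> tuple[int, int, int]:
    """Latent grid (frames, height, width) after compression.

    t_factor and s_factor are assumed values, not figures from the article.
    """
    return (
        math.ceil(frames / t_factor),
        math.ceil(height / s_factor),
        math.ceil(width / s_factor),
    )


def position_reduction(frames: int, height: int, width: int,
                       t_factor: int = 4, s_factor: int = 16) -> float:
    """How many times fewer spatio-temporal positions the latent has than the video."""
    lt, lh, lw = latent_shape(frames, height, width, t_factor, s_factor)
    return (frames * height * width) / (lt * lh * lw)


if __name__ == "__main__":
    frames = 5 * 24  # 5 seconds at 24fps
    print("latent grid:", latent_shape(frames, 720, 1280))
    print("position reduction:", position_reduction(frames, 720, 1280))
```

Under these assumed factors, the diffusion model works on a grid about a thousand times smaller than the raw pixel volume, which is the kind of saving that makes 720p generation plausible on a single consumer GPU.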

This breakthrough allows the TI2V-5B model to generate 720p video at 24fps on consumer-grade GPUs such as the NVIDIA RTX 4090, requiring less than 24GB of VRAM. It brings advanced AI video tools to a much wider audience of developers, researchers and creators. To accelerate adoption, the WAN2.2 models have already been integrated into popular community tools such as ComfyUI and Hugging Face Diffusers.

Alibaba's decision to release WAN2.2 under the permissive Apache 2.0 license is a direct strategic challenge to the closed, proprietary models that dominate the high end of the market. Companies like OpenAI and Google keep their most advanced video models, Sora and Veo, behind paywalls and APIs.

By offering a powerful and free alternative, Alibaba is betting that competition will intensify and that an open ecosystem will encourage faster innovation and wider adoption. The strategy mirrors the disruption seen in AI image generation, where open-source models have become formidable competitors to closed systems.

Part of a broader AI ecosystem offensive

The launch of WAN2.2 is not an isolated event. It is the latest move in a rapid-fire series of major AI releases from Alibaba, signaling a comprehensive offensive to establish itself as a leader across multiple AI domains. This surge in activity demonstrates a clear strategy of building a complete suite of open tools for developers.

In the previous week, the company unveiled its new flagship reasoning model, Qwen3-Thinking-2507, which topped key industry benchmarks. It also launched Qwen3-Coder, a powerful agentic coding model for automating software development tasks.

This strategic pivot was underscored by a statement from Alibaba Cloud explaining the decision to abandon the “hybrid thinking” mode of its previous models. The spokesperson said, “After discussing with the community and reflecting on the matter, we have decided to abandon the hybrid thinking mode. We will now train the Instruct and Thinking models separately to achieve the best possible quality.”

To showcase real-world applications of its AI, Alibaba also previewed its new “Quark AI” smart glasses. The wearable is powered by the Qwen3 series, a move designed to build market trust by connecting software prowess to a tangible consumer product.

Song Gang of Alibaba's Intelligent Information Business Group shared his vision for the technology: “AI glasses will become the most important form of wearable intelligence. They will act as another pair of human eyes and ears.”

Timely launch amid benchmark skepticism

However, this aggressive push comes amid growing industry skepticism about the reliability of AI benchmarks. A few days before the recent Qwen releases, a study claimed that Alibaba's older Qwen2.5 model had “cheated” on a key math test by memorizing answers from contaminated training data.

The controversy highlights the systemic problem of “teaching to the test” in the race for leaderboard dominance. As AI strategist Nate Jones pointed out, “The moment we set leaderboard dominance as the goal, we risk creating models that excel in trivial exercises and flounder when facing reality.” The sentiment is echoed by experts like Sara Hooker, head of Cohere Labs: “When a leaderboard is important to a whole ecosystem, the incentives are aligned for it to be gamed.”

Alibaba has since moved on to newer models like Qwen3, but the allegations cast a shadow over the “benchmark wars” that define AI competition. The WAN2.2 release, with its focus on tangible features and accessibility, could be an attempt to shift the narrative from leaderboard scores to real-world utility and open innovation.


