AI Week 25.09.27 Review

Figure 1. Still from a WAN 2.5 demo video generated on Higgsfield. WAN 2.5 provides synchronized audio-video generation of 1080p 60fps cinematic video with lip-synced speech from text or paired audio.

In conjunction with Alibaba's Apsara cloud and AI conference this week, Alibaba's Qwen team delivered a flurry of great new AI model releases. Qwen went to the max, and Alex Volkov of ThursdAI called it another "Qwen-mas." The highlights of the AI model releases are as follows:

The Qwen team has released its latest flagship AI model, Qwen3-Max. It has over a trillion parameters, comes in both instruct and thinking versions, achieves near-SOTA performance on benchmarks, and is especially strong at code generation and agentic tool calling. Qwen3-Max was pre-trained on 36 trillion tokens. It is available on Alibaba Cloud Model Studio.

Qwen has introduced Qwen3-VL, a new suite of multimodal models headlined by Qwen3-VL-235B-A22B, which comes in both instruct and thinking versions. The Qwen3-VL models focus on image-text tasks and visual reasoning. Qwen3-VL-235B-A22B-Thinking can achieve SOTA results on visual reasoning tasks and outperform GPT-5.

Qwen released Qwen-Image-Edit-2509, an upgrade to its AI image editing model that provides "multi-image editing that is pixel-perfect for creators." For example:

"Drag in 'person + product' or 'person + scene.' It blends like magic. No more Franken-images."

You can access Qwen-Image-Edit on Hugging Face, ModelScope, and Qwen Chat.

The Qwen team open-sourced Qwen3-Omni-30B-A3B, a native omni-modal AI model that takes text, image, audio, and video input and produces text and streaming audio output. The model comes in three variants: Instruct, Thinking, and Captioner. The Qwen3-Omni GitHub repo shares a technical report, and live access is on Hugging Face.

The Qwen team released Qwen3-TTS-Flash, a SOTA multi-timbre, multilingual speech synthesis model that delivers highly expressive audio output with high quality and accuracy across multiple languages. Try it in its Hugging Face Space.

But wait, there's more!

Alibaba open-sourced Wan-Animate as part of WAN 2.2, enabling character animation with motion transfer and lip sync. Tongyi Lab, the team behind Wan-Animate, has produced a technical paper and published demos that animate a target character from a performer's video. The WAN 2.2 Animate 14B model is available on Hugging Face.

Figure 2. Wan-Animate lets you replace a character in a video with a selected animated character, preserving expressions and movements and seamlessly replicating the original scene.

Alibaba has released a WAN 2.5 preview with synchronized audio-video generation of 1080p 60fps cinematic video, in clips of up to 10 seconds, with lip-synced speech from text or paired audio. It includes product-to-video, draw-to-video, and a lip-sync studio. WAN 2.5 dropped on Higgsfield, and it is available on FAL and the Qwen WAN site. This is a big step up in video generation quality and sound.

Google has released updated Gemini 2.5 Flash and Flash-Lite models, significantly improved in intelligence, cost-effectiveness, and speed. Artificial Analysis reports that both Gemini 2.5 Flash and 2.5 Flash-Lite, in thinking and non-thinking versions alike, show a significant improvement on its overall intelligence benchmark. Gemini 2.5 Flash also improved at agentic tool use and coding, jumping from 49% to 54% on SWE-bench Verified.

Figure 3. The upgraded Gemini 2.5 Flash and Flash-Lite significantly improve their scores and rankings on the Artificial Analysis Intelligence Index.

The updated Gemini 2.5 Flash and Flash-Lite 2509 versions are faster and more cost-effective, increasing token efficiency while enhancing quality. Flash-Lite cut the tokens it used on the Artificial Analysis benchmarks by 50%. Gemini 2.5 Flash-Lite is now the fastest proprietary model. Pricing for Gemini 2.5 Flash and Flash-Lite has not changed.

DeepSeek has released DeepSeek-V3.1-Terminus, an update to DeepSeek-V3.1 with more stable output, better language consistency (less mixing of English and Chinese), and stronger agent performance. On agentic tool-use benchmarks, BrowseComp rose from 30% to 38.5% and TerminalBench from 31.3% to 36.7%. The model and release notes are posted on Hugging Face.

Kling has released Kling 2.5 Turbo, an upgrade to its video AI model with improved video quality and a 30% lower price than the previous 2.1 model. Kling 2.5 Turbo improves prompt adherence, fluid motion, stability in dynamic scenes, and video consistency. Kling's price, speed, and quality position it well in this highly competitive area.

Suno has released Suno V5, the company's most advanced AI music generation model, featuring high-fidelity audio, more natural vocals, and improved composition. Early reviews note the technical gains of V5 over V4.5 and its "incredibly good vocals"; however, The Verge calls it "soulless." For now, Suno V5 is available only to paying users.

The Suno V5 release comes just days after record labels accused Suno in a lawsuit of illegal "stream ripping" for model training.

Liquid AI launched the Liquid Nanos family, a set of small task-specific frontier AI models ranging from 350M to 2.5B parameters and designed for agentic tasks on edge devices. The task-specific models are: LFM2-Extract, for extracting data from unstructured text; LFM2-350M-ENJP-MT, for bidirectional English-Japanese translation; LFM2-1.2B-RAG, for question answering in RAG pipelines; LFM2-1.2B-Tool, for agentic tool calling; and LFM2-350M-Math, for solving math problems. The models can be run on the Liquid Edge AI Platform (LEAP) and are available via Hugging Face.

Xiaomi has released MiMo-Audio-7B, an open-source 7B audio model that supports thinking and few-shot generalization on audio tasks. The MiMo technical report, "MiMo-Audio: Audio Language Models are Few-Shot Learners," explains how pre-training MiMo on over 100 million hours of audio allows many audio tasks to be performed as few-shot learning tasks. MiMo-Audio is available on Hugging Face.

Moondream has released a preview of Moondream 3, a 9B MoE VLM with 2B active parameters that targets frontier-level visual reasoning. Moondream 3 is a hybrid reasoning model that focuses on visually grounded reasoning.

IBM has released Granite-Docling-258M, a small multimodal OCR model built for efficient (image-text-to-text) document conversion. It is the successor to SmolDocling and is compatible with Docling tools and typical document conversion workflows.

Google DeepMind has launched Gemini Robotics 1.5, bringing AI agents into the physical world with upgraded robotics thinking models: Gemini Robotics 1.5, a vision-language-action (VLA) model, and the VLM-based Gemini Robotics-ER 1.5. These models allow robots to plan multi-step tasks, using reasoning and web search to adapt their actions to external information. Access is via the Gemini API and in Google AI Studio.

OpenAI has introduced ChatGPT Pulse, a proactive daily update experience in ChatGPT that performs asynchronous research and offers personalized briefings each morning. Pulse can connect to Gmail and other personal data sources for context and is curated by user feedback. It is rolling out in preview to Pro users on mobile. Feedback on this active personal support.

Meta AI has introduced Vibes, a feed of AI-generated video in the Meta AI app. Vibes is a stream of easily remixable short AI videos from creators and communities, where users can post their own AI clips and remix content with visuals and music. Users can share AI videos directly to Vibes, DMs, Instagram, and Facebook Reels. It brings AI-first creation to the mainstream in the Meta ecosystem.

Google has improved the Gemini Live API to support more natural conversations and agentic use of external data and services.

YouTube Music is testing AI radio "hosts" that act as AI commentators for mixes and radio stations, sharing trivia and background on tracks. The test is limited to some US users. YouTube has also expanded its AI "jump-to" feature to TVs and consoles.

Microsoft has added Anthropic models alongside the OpenAI models used in Microsoft Copilot Studio. Copilot Studio will continue to use OpenAI as the default, but you can also build a Copilot AI agent by choosing the Anthropic models Claude Sonnet 4 and Claude Opus 4.1.

Meta released Code World Model (CWM) 32B, an open-weight LLM for code generation that uses training signals from world modeling. It was released as part of research showing how world models can benefit agentic coding. Meta's research paper discusses mid-training on agent trajectories from Python interpreters, along with RL for multi-task reasoning. The CWM research opens up the possibility of world-model-based agents for coding and software engineering.

The AI infrastructure investment boom rolls on. OpenAI, Oracle, and SoftBank have announced five new US Stargate data center sites, unveiling sites in Texas, New Mexico, and the Midwest and bringing the program to nearly 7 GW of planned capacity and $400 billion in investment so far.

OpenAI and Nvidia announced a strategic partnership: Nvidia has committed to invest up to $100 billion in OpenAI and will provide data center chips and GPU systems for OpenAI infrastructure. Under the partnership, OpenAI buys Nvidia AI systems, while Nvidia acquires non-controlling shares in OpenAI. The partnership aims to deploy at least 10 gigawatts of computing power over the next few years, with $10 billion committed at signing and deliveries starting in the second half of 2026.

GPU cloud provider CoreWeave expanded its OpenAI agreement with a new $6.5 billion deal, which brings the total to $22.4 billion, and will work on OpenAI's Stargate infrastructure build-out.

Microsoft unveiled microfluidic chip cooling for dense AI data centers. Lab results suggest up to three times better heat removal than traditional cold plates and a 65% cut in maximum GPU temperature rise, allowing for 3D chip stacking and more overclocking.

The General Services Administration has added Meta's Llama to the US government's list of approved AI tools for agency use, joining tools from Amazon, Microsoft, Google, Anthropic, and OpenAI. This allows agencies to adopt open models for tasks such as government contract review.

A federal judge granted preliminary approval of Anthropic's $1.5 billion copyright settlement with leading US authors. The court called the deal "fair" while reserving final approval.

Spotify has moved against AI "slop" and impersonation, with new rules to remove deceptive or spammy AI music, such as unauthorized vocal impersonation. Spotify aims to balance creator experimentation with listener trust and rights management, and it is collaborating on an industry standard for AI disclosure in song credits.

Elon Musk's xAI has reached an agreement with the US government's purchasing arm to sell its AI chatbot Grok to the federal government for less than a dollar. Federal agencies will pay 42 cents a year for Grok, undercutting OpenAI's and Anthropic's $1 offerings, and the deal includes engineering support.
