Chinese technology giant Alibaba has announced its first suite of artificial intelligence models for robots, according to a company blog post on Monday (June 15).
The Qwen Robot Suite arrives as AI companies move beyond chatbots toward physical AI.
“The fundamental models of the Qwen family already provide powerful perception and reasoning about the physical world,” the post states. “But seeing is not doing. The gap between vision, language understanding, and bodily control remains the central bottleneck for embodied intelligence.”
According to the post, the three models in the Qwen Robot Suite are meant to bridge this gap. Qwen-RobotNav helps robots understand how to move through physical space. Qwen-RobotWorld is a video “world model” that lets robots predict how physical scenarios will play out.
Finally, Qwen-RobotManip “transforms disparate robot data into a consistent standard space, enabling cross-entity training at scale,” the post states.
“Combined, these enable agent systems where general intelligence is directly translated into physical behavior,” the post says.
The announcement comes a week after reports that Alibaba Group has established a new business unit known as Token Foundry following a reorganization to strengthen its AI efforts.
Led by Alibaba CEO Eddie Wu, the Token Foundry will combine Alibaba’s Tongyi Lab and Future Life business units and operate under the company’s recently established Alibaba Token Hub.
In other physical AI news, Nvidia announced its Cosmos 3 foundation model for physical AI last week. “Thanks to breakthrough advances in multimodal reasoning languages, vision, and world models, the big bang of physical AI is just around the corner,” Nvidia founder and CEO Jensen Huang said at the launch.
The distinction between language models and world models matters for anyone building or deploying physical AI. Large language models learn from text, while world models learn from the physical environment.
Robots need millions of interaction examples to learn how to handle objects, while self-driving cars need exposure to rare and dangerous scenarios that cannot be collected safely or cheaply at scale on public roads.
“The World Foundation model solves this problem by generating synthetic training data that reflects real-world physics,” PYMNTS reported at the time. “Instead of driving test vehicles for years, self-driving car developers can run millions of simulation scenarios in days.”
For all PYMNTS AI coverage, subscribe to the daily AI Newsletter.
