DeepSeek unveils new AI model as China promotes technological autonomy





DeepSeek, the Chinese startup that surprised the world last year with its low-cost AI models, on Friday announced a preview of its long-awaited new model adapted to Huawei’s chip technology, underscoring China’s growing autonomy in the field.

According to DeepSeek, the Pro version of the new model outperforms other open source models on the World Knowledge Benchmark, trailing only Google’s closed source Gemini-Pro-3.1.

The close collaboration with Huawei on the new V4 model stands in contrast to DeepSeek’s past reliance on Nvidia AI chips. Huawei said its chips were used for part of V4’s training process.

“This is a big deal for China’s AI industry,” said He Hui, director of semiconductor research at consultancy Omdia.

“Huawei’s Ascend chips are the country’s best domestic alternative to Nvidia, and DeepSeek V4 support shows that China’s top AI models can run on Chinese hardware.”

Most major AI models are trained and run on chips made by Nvidia. DeepSeek’s pivot to Huawei also highlights concerns raised by Nvidia CEO Jensen Huang that U.S. companies risk losing their developer ecosystem in China due to U.S. export controls and Beijing’s push for self-sufficiency.

“The day DeepSeek first rolls out on Huawei, that’s a terrible outcome for our country,” Huang said on a podcast this month.

Lewis Tunstall, a machine learning engineer at Hugging Face, said V4 reached the top spot on Hugging Face, a popular developer platform for sharing and running machine learning models, faster than any other model.

The model handles very long and complex text tasks well and is much cheaper to run than competing top models, but it has limitations. For example, it does not support other modalities such as images and video, Tunstall said.

Close collaboration between Huawei and DeepSeek
DeepSeek has drawn criticism from Washington and from its U.S. rivals, which attribute much of its success to improper use of American know-how.

Meanwhile, DeepSeek has acknowledged the use of Nvidia chips, but has not said whether those chips are subject to an export ban. The company also said it did not intentionally use synthetic data generated by OpenAI.

Friday’s announcement comes a day after the White House accused China of stealing intellectual property from U.S. AI labs on an industrial scale and ahead of U.S. President Donald Trump’s trip to Beijing next month to meet with Chinese leader Xi Jinping.

The Trump administration gave the green light in January to sell Nvidia’s powerful H200 chip in China, but shipments have been hampered by disagreements over terms of sale in both China and the United States, officials said.

Shares of Chinese chipmakers rose on expectations of wider adoption of domestically produced chips, with Hua Hong Semiconductor and SMIC gaining 15% and 10%, respectively.

Nvidia stock also rose after Intel predicted unexpectedly strong revenue and profits, reinforcing confidence that the AI boom shows no signs of slowing down.

DeepSeek currently faces many competitors
Many Western countries and some Asian governments have banned the use of DeepSeek by their agencies and officials, citing data privacy concerns. Nevertheless, DeepSeek’s models have consistently ranked among the most popular on international platforms hosting open source models.

Although DeepSeek rocketed to national champion status in China a year ago, its lead has evaporated amid a slew of competitive offerings from domestic rivals. The release of V4 sent rival stocks plummeting, with Zhipu AI and MiniMax both down 9%.

DeepSeek said Friday that V4 is particularly suited to the work of AI agents, which can perform more complex tasks than chatbots but require more computing power.

How successful it will be remains to be seen.

“My initial take is that the DeepSeek V4 preview looks significant, but until there is independent evaluation and more real-world developer testing, I would be cautious about taking the benchmark headlines at face value,” said Daniel Dewhurst, an AI engineer who tested V4 after its release.

Notably, however, V4 shows that open models that people can use and run themselves appear to be further closing the gap with closed models, especially when it comes to cost, long context, and coding, he said.

V4 can process over 1 million tokens, a context window comparable to those of OpenAI’s GPT-5.4 and Anthropic’s Claude Opus 4.6, but requires only a fraction of the compute to do so.

V4 also has a lower cost Flash version. Preview versions allow companies to incorporate real-world feedback and make changes prior to final product launch. DeepSeek did not say when the model is expected to be completed.

DeepSeek, owned by China’s High-Flyer Capital Management, is aiming to raise capital at a valuation of more than $20 billion, and tech giants Alibaba and Tencent are in talks to acquire stakes, The Information reported this month.


