DeepSeek's latest technical paper, co-authored by the company's founder and CEO, Liang Wenfeng, has been hailed as a potential game-changer for artificial intelligence model development, as it could lead to improvements in the basic architecture underlying machine learning models.
Manifold-constrained hyperconnections (mHC), the subject of the paper, improve on traditional hyperconnections in residual networks (ResNets), a fundamental mechanism underlying large language models (LLMs). The work also demonstrates the continued efforts of Chinese AI startups to train powerful models with limited computing resources.
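For readers unfamiliar with the terminology, the toy sketch below contrasts a plain residual shortcut with the general hyper-connection idea of keeping several parallel copies ("streams") of the hidden state and learning how they mix. It is an illustrative PyTorch sketch under those assumptions only; it is not the mHC formulation from DeepSeek's paper, and the class names, stream count and mixing parameters are hypothetical.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Standard residual shortcut: output = x + F(x)."""

    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):                       # x: (batch, dim)
        return x + self.f(x)


class HyperConnectionBlock(nn.Module):
    """Toy hyper-connection-style block: keeps n parallel hidden-state
    streams and learns how they feed the block and how the block's output
    is written back, instead of a single fixed x + F(x) shortcut.
    Illustrative only; not DeepSeek's mHC formulation."""

    def __init__(self, dim, n_streams=4):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        # Learned connection weights (hypothetical parameterization):
        self.read = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))  # streams -> block input
        self.write = nn.Parameter(torch.ones(n_streams))                     # block output -> streams
        self.mix = nn.Parameter(torch.eye(n_streams))                        # stream-to-stream routing

    def forward(self, streams):                 # streams: (n_streams, batch, dim)
        block_in = torch.einsum('s,sbd->bd', self.read, streams)
        block_out = self.f(block_in)            # (batch, dim)
        routed = torch.einsum('st,tbd->sbd', self.mix, streams)
        return routed + self.write[:, None, None] * block_out


# Usage: expand one hidden state into 4 streams, apply the block,
# then average the streams back down at the end of the stack.
x = torch.randn(8, 64)                          # (batch, dim)
streams = x.unsqueeze(0).repeat(4, 1, 1)        # (n_streams, batch, dim)
streams = HyperConnectionBlock(dim=64, n_streams=4)(streams)
y = streams.mean(dim=0)                         # collapse back to (batch, dim)
```

Because the extra parameters in such schemes are only small per-layer mixing weights, they add little computation relative to the model itself, which is one way to read the scaling results reported below.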
In the paper, a team of 19 DeepSeek researchers said they tested mHC on models with 3 billion, 9 billion, and 27 billion parameters and found that it could scale without adding significant computational burden.
The paper, published on January 1, immediately sparked interest and discussion among developers despite its dense technical content.
Professor Quan Long from the Hong Kong University of Science and Technology said the new discovery is “very important for transformer architectures made for LLM”. Quan said he is “very excited to see the significant optimizations made by DeepSeek that have already revolutionized LLM efficiency.”
The paper comes at a time when most AI startups are focused on turning the capabilities of LLMs into agents and other products. But DeepSeek, a side project of Liang's quantitative trading firm, has been seeking to improve the fundamental technical mechanisms of how machines learn from data.
