KAN explodes!



In late April 2024, researchers from MIT and Caltech published a new AI research paper proposing a fundamentally new approach to machine learning networks: the Kolmogorov-Arnold Network (KAN). In the six weeks since the paper's publication, the AI research field has been buzzing with excitement and speculation that KAN could be a breakthrough invention that changes the trajectory of AI models for the better, resulting in dramatically smaller model sizes and orders of magnitude less power consumption for both training and inference, at similar accuracy.

Low computational complexity for training and inference

The progress in the world of AI over the past 18 months has been astounding for both industry and the general public. Generative AI models that produce language and images have captured public attention. Business publications and conferences have speculated about the disruption to the economy and the expected benefits to society. But the enormous computational costs of training and running increasingly large models are also worrying policymakers. According to various forecasts, LLMs alone could consume more than 10% of the world's electricity in just a few years, with no end in sight. Against this backdrop, the idea of KANs emerged. Early analyses suggest that KANs could be 1/10th to 1/20th the size of traditional MLP-based models while achieving comparable results.

It's important to note that data center builders aren't the only ones grappling with the massive compute and power required by today's cutting-edge generative AI models. Device manufacturers looking to run GenAI on their devices are also struggling with compute and storage demands that exceed what the price points of their products can support. For business executives wondering how to cram 32GB of expensive DDR memory into a cheap phone so they can run a 20B parameter LLM, the idea of a 1B parameter model that fits snugly into existing platforms with only 4GB of DDR is a lifesaver.
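As a rough back-of-the-envelope check on those memory figures, the weight storage of a model can be estimated as parameter count times bits per parameter. The sketch below assumes 8-bit weights, which the article does not state, and ignores activations, caches, and everything else that shares the same DDR. The function name `weight_footprint_gb` is purely illustrative.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int = 8) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumption: weights dominate memory use; activations, caches, and the
    operating system are ignored, so real requirements will be higher.
    """
    return num_params * bits_per_param / 8 / 1e9


# Assumed 8-bit weights (hypothetical; the article gives no precision).
large_model_gb = weight_footprint_gb(20e9)  # 20B-parameter model
small_model_gb = weight_footprint_gb(1e9)   # 1B-parameter model

print(f"20B parameters: about {large_model_gb:.0f} GB of weights")
print(f" 1B parameters: about {small_model_gb:.0f} GB of weights")
```

Under that assumption the 20B model needs roughly 20 GB for weights alone, which is why a 32GB part comes into the picture, while the 1B model leaves plenty of headroom in 4GB.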

Built on a different mathematical foundation

Because this author is more of an aspiring comedian than an aspiring mathematician, he won't go into the mathematical principles underlying KAN or how it differs from traditional CNNs or Transformers. For the more technically inclined, there are already some excellent high-level explanations available, such as this one and this one. However, the key takeaway for business-minded decision makers in the semiconductor industry is that KAN is not built on matrix multiplication as its basic building block. Instead, running KAN inference consists of computing a huge number of univariate functions (think polynomials such as $8x^3 - 3x^2 + 0.5x$) and then adding the results. Matrix multiplications (MATMULs) are almost entirely absent.
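To make the "evaluate univariate functions, then add" idea concrete, here is a minimal sketch of a single KAN-style layer in plain Python. It is an illustration under stated assumptions, not the paper's implementation: each edge function is a fixed polynomial (following the article's analogy), whereas real KANs learn these functions, and the names `eval_poly` and `kan_layer` are hypothetical.

```python
from typing import List, Sequence

# A polynomial is its coefficient list, lowest degree first:
# [c0, c1, c2] means c0 + c1*x + c2*x^2.
Poly = Sequence[float]


def eval_poly(coeffs: Poly, x: float) -> float:
    """Evaluate a univariate polynomial using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def kan_layer(edge_funcs: List[List[Poly]], inputs: Sequence[float]) -> List[float]:
    """Compute one KAN-style layer.

    edge_funcs[j][i] is the univariate function on the edge from input i
    to output j. Each output is the plain sum of its edge functions
    applied to the matching inputs; no weight matrix is multiplied.
    """
    outputs = []
    for row in edge_funcs:
        if len(row) != len(inputs):
            raise ValueError("each output needs one function per input")
        outputs.append(sum(eval_poly(f, x) for f, x in zip(row, inputs)))
    return outputs


# The article's example polynomial: 8x^3 - 3x^2 + 0.5x
article_poly = [0.0, 0.5, -3.0, 8.0]

# A toy layer with 2 inputs and 2 outputs (coefficients are made up).
layer = [
    [article_poly, [1.0, 2.0]],  # output 0: p(x0) + (1 + 2*x1)
    [[0.0, 0.0, 1.0], [3.0]],    # output 1: x0^2 + 3
]

result = kan_layer(layer, [1.0, 2.0])
print(result)  # [10.5, 4.0]
```

The inner loop is nothing but scalar multiplies and adds per edge, which is the workload the article argues favors general-purpose ALUs over hardwired matrix engines.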

Throw out your old NPU and respin the silicon?

Wow! No matrix multiplication in KAN? What about the large silicon area you allocated to fixed-function NPU accelerators in your new SoC designs? Most NPUs have hardwired state machines to perform matrix multiplication, which is the core of convolution operations in current ML models. They also have hardwired state machines to implement common activations (such as ReLU and GELU) and pooling functions. None of that matters in the KAN world, where the name of the game is evaluating polynomials and performing high-precision ADDs at scale.

GPNPU is the savior

Semiconductor executives who are shocked by KAN may exclaim, “I wish there were a general-purpose machine learning processor with a massively parallel array of general-purpose compute that could perform the ALU operations KAN requires.” But such a processor exists. Quadric's Chimera GPNPU uniquely combines both the matrix multiplication hardware required to efficiently run traditional neural networks and a massively parallel array of general-purpose, C++ programmable ALUs that can run any machine learning model. For example, Quadric's Chimera QB16 processor combines 8192 MACs with 1024 full 32-bit fixed-point ALUs, giving users a massive 32,768 bits of parallelism (1024 ALUs × 32 bits), ready to run KAN networks (if they live up to the current hype) or whatever the next breakthrough invention turns out to be in 2027. Future-proof your next SoC design. Learn more at quadric.io.

Steve Roddy


Steve Roddy is Chief Marketing Officer at Quadric.io. He previously served as Vice President of the Machine Learning group at Arm, and prior to that was Vice President of IP Licensing at Tensilica (acquired by Cadence) and Amphion Semiconductor, and has also held product management positions at Synopsys, LSI Logic, and AMCC.


