The global semiconductor market experienced a difficult year in 2023. According to the Semiconductor Industry Association (SIA), global chip sales totaled $526.8 billion in 2023, a decrease of 8.2% from the previous year.
Beyond the IC industry's normal business cycle, a steep decline in the memory sector contributed to this poor performance. Memory product revenue fell 37% last year, the largest decline of any semiconductor segment, according to market analyst Gartner. Nevertheless, there were positive signs in the second half of the year, led by the AI sector. A new wave of AI took off in 2023 as AI-based applications spread across many sectors, including data centers, edge infrastructure, and endpoint devices.
According to market analyst Counterpoint Technology Market Research, AI has brought positive news to the semiconductor industry, emerging as a key content and revenue driver, especially in the second half of 2023.
In fact, AI is expected to lead the semiconductor recovery in 2024. According to Gartner, AI chips generated a $53.4 billion revenue opportunity for the semiconductor industry in 2023, an increase of approximately 21% over the previous year. Continued double-digit growth is expected in this field: the market is projected to reach $67.1 billion in 2024 and more than double from its 2023 size to $119.4 billion by 2027.
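The growth rates implied by Gartner's figures can be checked directly. A quick sanity-check sketch (the revenue figures come from the article; the growth math is illustrative):

```python
# Gartner AI-chip revenue projections cited above, in billions of USD.
rev = {2023: 53.4, 2024: 67.1, 2027: 119.4}

# Year-over-year growth from 2023 to 2024.
yoy_2024 = (rev[2024] / rev[2023] - 1) * 100

# Compound annual growth rate over the three years 2024 -> 2027.
cagr = ((rev[2027] / rev[2024]) ** (1 / 3) - 1) * 100

print(f"2024 YoY growth: {yoy_2024:.1f}%")   # roughly 26%, i.e. double-digit
print(f"2024-2027 CAGR: {cagr:.1f}%")        # roughly 21% per year
print(rev[2027] / rev[2023] > 2)             # True: more than double 2023
```

The numbers are consistent with the article's claims: continued double-digit annual growth, with 2027 revenue more than double that of 2023.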
“There are a lot of opportunities in the AI space,” says Ken Lau, CEO of AI chip startup Neuchips. “When we look at public data, we see that AI, especially generative AI [GenAI], could become a $1 trillion market by the 2030 time frame. There's actually a lot of money being spent on training right now, but we're going to see investment in inference in the second half of the decade.”
Lau said that going forward, different usage models for inference will emerge. “Once you train on your data, you can run inference to help you do your job better. For example, various companies plan to use AI to enhance their chatbots and customer-service capabilities. The same goes for how a particular brand speaks: when a consumer asks a question, the AI spokesperson can explain the brand to the customer, spark their interest, and send them to a website where they can buy the product,” he explains. “I think there will be uses down the road that we can't even imagine. The possibilities for AI are limitless, I think. And a big part of that is going to be inference, not just training.”
Focus on inference
Founded in 2019, Neuchips recognized the important role that inference would play in the future and set its sights on it, specifically recommendation engines.
One rationale behind this is that many data centers use recommendation engines. “When you buy parts or products online, they recommend something to you. For example, if you buy this brand of tennis racket, they also recommend another brand,” Lau says.
So Neuchips selected a recommendation engine, built a prototype using an FPGA, proved the design worked, and then designed the chip.
The company's 2022 inference chip, the N3000, performed very well, proving to be 1.7× better than market competitors in performance per watt based on MLPerf 3.0 benchmarking.
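Performance per watt, the metric behind that 1.7× claim, is simply measured throughput divided by board power. A minimal sketch of such a comparison (the sample throughput and power numbers below are hypothetical, not Neuchips' published MLPerf results):

```python
def perf_per_watt(queries_per_sec: float, power_w: float) -> float:
    """Efficiency metric used in MLPerf-style power comparisons."""
    return queries_per_sec / power_w

# Hypothetical example: a low-power accelerator vs. a high-power card.
accel = perf_per_watt(8_500, 55.0)    # e.g. a 55 W inference card
gpu = perf_per_watt(30_000, 330.0)    # e.g. a 330 W GPU

print(f"Accelerator delivers {accel / gpu:.2f}x the perf/W")  # prints 1.70x
```

A lower-throughput part can still win on this metric if its power draw is proportionally smaller, which is why per-watt comparisons favor purpose-built inference silicon.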
“When we built this chip, we had a recommendation engine in mind. We built it for recommendations,” Lau explains. “But when GenAI turned a corner, we tried it on the chip and it ran, because the memory subsystem is optimized for recommendation engines and that same memory subsystem applies to GenAI as well. Our demos at the AI Hardware Summit in the US and at SC23 let users try chatbots running on our own chips. We are one of the few AI companies to do so.”
At the recently held EE Awards Asia 2023, Neuchips' N3000 won the “Best AI Chip” award. “This shows the level of execution we can achieve here in Taiwan,” Lau says. “If you look at the big companies doing chip design today, they're not doing the core logic; they use smaller chips. We are one of the few companies that have adopted 7 nm for computing. That's why it's important. And we were able to achieve recommendation performance that is 1.7 times better than others. There's something to be said for that.”
Lau proudly notes that the device worked on its first tape-out. “Other companies can afford multiple spins to get their chips right. For our N3000 product, we only had one chance, because we're only a startup. We don't have any money to waste. So we did it in one shot and it worked, which I think is an important accomplishment and a reflection of our level of execution.”
Industry challenges
Despite optimistic estimates, the AI semiconductor segment continues to face many challenges, which vary by customer and application.
“There are companies out there who want to integrate AI into their product portfolios or embed AI into their services,” Lau explains. “One of the challenges here is the software-integration part. And how do you train on internal data? For example, if I'm a hospital, all my data sets need to stay private. We can't move them to the cloud. How can we use that data and train on it so doctors can access it in a more meaningful way?”
According to Lau, training on such data at the enterprise level can be a challenge, because hospitals, for example, don't hire software engineers just to train models on their data.
“Their data is private, so they're going to need those kinds of software services and hardware in-house,” Lau points out. Accordingly, he expects the enterprise sector to pick up as well.
Another challenge that continues to plague the chip industry is power. And since AI chips have high computing power, they cannot escape this problem.
“It depends on what kind of edge devices you want to connect to,” Lau says. “First of all, our chip can go down to about 25 W to 30 W. The standard is about 55 W, but because we were able to compress it into a dual M.2 form factor, we can get it down to about 25 W to 30 W. For example, with a passive heatsink and a fan, you can put it in a PC without any problem. It might still be a bit big, and I'm not going to put it in a laptop. But that doesn't stop people from building docking stations that connect to laptops as GenAI devices.”
Meanwhile, to help customers address these challenges, Neuchips approaches the problem from two angles: hardware and software.
“For one, we provide the hardware. For data centers, we don't need high-power connections,” Lau says. “Our chips have low power consumption and can be installed in even the smallest of spaces. Our products can be installed in 1U servers and desktops using cards in a variety of form factors. We also provide the entire software stack: SDKs [software development kits], drivers, and everything else.”
Neuchips can also provide data-services integration or training to its customers. “If we train using their own data, give it back, and provide the hardware, it becomes more efficient. This creates a win-win situation for us and our customers,” Lau says.
Future Plans
Lau said training and edge applications will be the main drivers of AI applications in the future.
“But looking at all the news today, I think when it comes to AI PCs, some of the new application providers will come up with new ways to do GenAI inference,” he says. “We are in uncharted territory, and we expect this to grow, but at the same time the application ecosystem needs to grow as well.”
Going forward, Neuchips will focus on different form factors. Apart from the dual M.2 form factor device, the company also has another module that can be plugged into a standard PCI Express slot for PC or low-end workstation applications.
