After $230M in funding, Positron becomes a unicorn, targeting Nvidia’s Rubin in inference race

Reno-based Positron AI has raised $230 million in a Series B round, valuing the energy-efficient AI inference hardware company at more than $1 billion.

The oversubscribed round was co-led by ARENA Private Wealth, Jump Trading and Unknown, with strategic support from Qatar Investment Authority, Arm and Helena. Existing investors including Valor Equity Partners, Atreides Management, DFJ Growth, and 1517 also participated in the round.

The new funding will accelerate development of Asimov, Positron’s next-generation custom silicon, with tape-out planned for late 2026 and production scheduled for early 2027. Asimov is designed to support up to 2 terabytes of memory per accelerator and significantly larger amounts of memory at the system and rack level.

Positron expects strong revenue growth in 2026 and says it is on track to reach significant commercial traction roughly 2.5 years after launch, which it says would make it one of the fastest-growing silicon companies.

Tackling AI’s growing energy challenge

Positron AI, led by Mitesh Agrawal, focuses on one of the biggest challenges facing AI today: the rising cost and power demands of running large-scale models.

While training AI systems has received most of the attention, inference, the process of running models in real-world applications, is rapidly becoming a major bottleneck for energy use and infrastructure.

“We appreciate the enthusiasm of our investors, which itself is a reflection of market demand,” said Mitesh Agrawal, CEO of Positron AI.

“Energy availability is emerging as a major bottleneck for AI deployments, and our next-generation chips deliver 5x more tokens per watt for our core workloads compared to Nvidia’s upcoming Rubin GPUs. Memory is another big bottleneck in inference, and our next-generation Asimov custom silicon is expected to ship next year with more than 2304 GB of RAM per device, compared to 384 GB for Rubin, only a fraction of that.”

From Atlas Systems to Asimov Silicon

Positron is building an infrastructure layer that enables AI at scale by reducing the cost and power required to run modern models.

The company already ships Atlas, an inference system designed for rapid deployment and expansion. The company says Atlas is manufactured entirely in the United States, allowing customers to quickly increase production capacity with a reliable supply chain.

“Memory bandwidth and capacity are two of the key limiting factors for scaling AI inference workloads for next-generation models,” said Dylan Patel, founder and CEO of SemiAnalysis. “Positron has a unique approach to the memory scaling problem, using next-generation Asimov chips that allow it to deliver more than an order of magnitude greater memory capacity per chip than existing or emerging silicon providers.”

In addition to Asimov, Positron’s roadmap also includes Titan, a next-generation system designed for memory-intensive AI workloads such as long-context language models, video, and agent-based systems.

“For us, speed of development is a key competitive advantage,” said Agrawal. “Competing with Nvidia means matching Nvidia’s shipping cadence, and we designed our organization around that goal.”

Positron is leveraging an ecosystem of industry leaders to build this platform, including Arm, Supermicro, and other key technology and supply chain partners.

“As AI inference scales, efficiency and system design become more important than raw benchmarks,” said Eddie Ramirez, vice president of go-to-market for Arm’s Cloud AI business unit. “Positron’s memory-centric approach, built on Arm technology, reflects how tightly coupled systems and a broad ecosystem work together to deliver scalable performance-per-watt improvements in next-generation AI infrastructure.”

“Positron solves one of the most important bottlenecks in AI: enabling inference at scale within real-world power and cost constraints,” said Ari Schottenstein. “The combination of today’s shipping traction with Atlas and a trusted path to Asimov creates a unique opportunity to define a new category in AI infrastructure.”

“For the workloads we care about, the bottleneck is memory and power, not theoretical compute,” said Alex Davies, chief technology officer at Jump Trading. “In our testing, we found that Positron Atlas delivered approximately three times lower end-to-end latency than comparable H100-based systems for the inference workloads we evaluated in an air-cooled, production-ready footprint with a plannable supply chain.”
