Crypto's machine learning “iPhone moment” draws closer as AI agents trade the market

AI-powered trading has yet to reach an “iPhone moment” where everyone carries an algorithmic reinforcement learning portfolio manager in their pocket, but experts say it's coming.

Indeed, the power of AI is put to the test when faced with the dynamic and adversarial realm of trading markets. Unlike an AI agent for a self-driving car, which can learn to recognize signals accurately from endless laps of driving, no amount of data or modeling can predict the future.

This makes refining AI trading models a complex and demanding process. Success has typically been measured by profit and loss (P&L). However, advances in how algorithms are customized have produced agents that continually learn how to balance risk and reward under different market conditions.

Using risk-adjusted metrics such as the Sharpe ratio to inform the learning process multiplies the sophistication of the test, said Michael Sena, chief marketing officer at Recall Labs. Recall Labs operates about 20 AI trading arenas, in which the community submits AI trading agents that compete over a period of four to five days.
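For readers unfamiliar with the metric, the Sharpe ratio divides an agent's average excess return by the volatility of those returns, so a steady agent can outscore a more erratic one with the same average gain. The sketch below is illustrative only: the function name, the zero risk-free rate, and the 365-period annualization (crypto markets trade every day) are assumptions, not Recall Labs' scoring method.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=365):
    """Annualized Sharpe ratio from a list of per-period returns.

    Assumptions (not from the article): risk_free_rate is expressed per
    period, and periods_per_year defaults to 365 for daily crypto returns.
    """
    # Return earned above the risk-free rate in each period.
    excess = [r - risk_free_rate for r in returns]
    # Sample standard deviation; needs at least two observations.
    volatility = stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility; ratio is undefined")
    return mean(excess) / volatility * math.sqrt(periods_per_year)


# Two hypothetical agents with the same average daily return.
steady = [0.010, 0.012, 0.009, 0.011]
volatile = [0.100, -0.080, 0.090, -0.068]
print(sharpe_ratio(steady) > sharpe_ratio(volatile))
```

Ranked on raw P&L, the two hypothetical agents would be roughly tied; ranked on the Sharpe ratio, the steadier one comes out ahead.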

“When it comes to scanning the market for alpha, next-generation builders are looking to customize and specialize their algorithms to account for user preferences,” Sena said in an interview. “Optimizing for specific ratios, rather than just the raw P&L, is similar to the way large financial institutions operate in traditional markets: What is the maximum drawdown, what is the value that was at risk to create this P&L, and so on.”
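Maximum drawdown, one of the measures Sena mentions, is the largest peak-to-trough decline in an account's value over a period. A minimal sketch of one common way to compute it follows; the function name and the sample equity curve are hypothetical and are not taken from any Recall Labs arena.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the peak.

    equity_curve is a list of positive account values in time order.
    """
    if not equity_curve:
        raise ValueError("equity_curve must not be empty")
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        # Track the highest value seen so far.
        peak = max(peak, value)
        # Decline from that peak to the current value.
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical account values over five periods.
curve = [100, 120, 90, 110, 130]
print(max_drawdown(curve))
```

In this example the account ends up 30% overall, but along the way it fell from 120 to 90, a 25% drawdown that a P&L figure alone would not reveal.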

Taking a step back, a recent trading competition on decentralized exchange Hyperliquid involving several large language models (LLMs), including GPT-5, DeepSeek, and Gemini Pro, set something of a baseline for where AI stands in the trading world. All of these LLMs were given the same prompts and traded autonomously, making their own decisions. But according to Sena, they weren't that great, barely outperforming the market.

“We took the AI models used in the Hyperliquid competition and had people submit trading agents they built to compete with those models. We wanted to see if the trading agents with added expertise were better than the basic model,” Sena said.

In the Recall contest, customized models took the top three spots. “While some models were unprofitable and performed poorly, we found that specialized trading agents that took these models and applied additional logic and reasoning, data sources, etc., outperformed the base AI,” he said.