Real-time fraud detection at Coinbase



Gain faster insights at a fraction of the cost

Since introducing RTM, Coinbase has improved its ability to mitigate fraud by ensuring that its risk models operate on the most up-to-date transaction data. Feature freshness is now sub-second: 150 ms for aggregated stateless streaming features and 250 ms for aggregated stateful streaming features. Consistency between online and offline functionality has improved by up to 98%.

This architectural change has enabled the team to achieve incredible scale and speed. Daniel Zhou, Senior Staff Machine Learning Platform Engineer at Coinbase, explains: “By leveraging Spark Structured Streaming’s real-time mode, we were able to reduce end-to-end latency by more than 80%, achieve a P99 of less than 100ms, and streamline our large-scale real-time ML strategy. This performance allows us to compute over 250 ML features, all powered by the integrated Spark engine.”
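For readers unfamiliar with the "P99" figure in the quote, it is the latency value that 99% of requests fall at or below. The sketch below shows one common way to compute it, the nearest-rank method, in plain Python. The function name `percentile_nearest_rank`, the 100 ms target check, and the sample latencies are all illustrative assumptions; none of this is Coinbase's data or tooling.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    pct is an integer in (0, 100]. The result is always one of the
    observed samples, with no interpolation.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest sample that covers pct percent of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical end-to-end latencies in milliseconds (made-up values).
latencies_ms = [42, 55, 48, 61, 95, 50, 47, 300, 52, 58]

p50 = percentile_nearest_rank(latencies_ms, 50)
p99 = percentile_nearest_rank(latencies_ms, 99)

# Assumed target for illustration: P99 under 100 ms.
meets_target = p99 < 100
```

In this made-up sample the median looks healthy, but a single slow request pushes the P99 over the assumed target. That is why tail percentiles, rather than averages, are the usual yardstick for latency-sensitive systems such as fraud scoring.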

Beyond the performance gains, RTM has allowed Coinbase to retire the specialized, mass-provisioned clusters that micro-batch mode previously required. This fundamentally changed the cost structure and let the team cut computing costs roughly in half.

“In addition to significantly improving data freshness and consistency, we have realized incredible cost savings,” added Wickramarachchi. “We estimate that this architectural change will reduce computing costs by 51% this year alone.”


