Jabez Eliezer Manuel, Senior Principal Engineer at Booking.com, presented “Behind the Scenes of Booking.com’s AI Evolution: The Unsophisticated Story” at QCon London 2026, discussing how Booking.com has evolved over the past 20 years and the challenges it faced in implementing AI.
In honor of the 20th QCon London, Manuel began his presentation with a look back at the technology of 2005: the Motorola Razr V3 was a popular phone, Web 2.0 was beginning to emerge, and Booking.com was in its 9th year.
In February 2005, Booking.com launched its first A/B testing experiment. The practice has since grown to more than 1,000 experiments running in parallel and a total of 150,000 experiments. The observed success rate, however, was less than 25%. Manuel said the goal was not to be right but to learn quickly. These experiments ultimately gave the company its data-driven DNA.
Manuel’s presentation covered three layers: data management, machine learning engineering, and domain intelligence.
Data Management
Booking.com’s original technology stack was built on Perl libraries and MySQL, which provided asynchronous replication and commercial support. In 2005, the company had only one master database; by 2020, this had grown to approximately 6,800 database instances. The MySQL setup is also unusual in that it uses no dedicated hardware, stored procedures, user-defined functions (UDFs), database views, or caching layers.
Their “secret sauce,” as Manuel characterized it, was keeping each database small (capped at 2 TB) so that it fit on a Non-Volatile Memory Express (NVMe) solid-state drive. They observed point queries taking less than 350 microseconds.
This model was successful until the data became too large. To solve this, Booking.com added Apache Hadoop for distributed storage and large-scale processing. By 2011, the company had two on-premises Hadoop clusters, each containing approximately 60,000 cores and 200 PB of hard disk space.
Hadoop powered machine learning pipelines for years until cracks in the system appeared. From a machine learning scientist’s perspective, these cracks included noisy neighbors, where one bad query clogs the cluster; a lack of GPU support; and capacity issues that caused overloads and outages at peak times. The decision was made to deprecate Hadoop by 2018, but the process of upgrading and moving away from Hadoop took about seven years.
Booking.com’s migration strategy had five phases.
- Map the entire ecosystem.
- Analyze usage and reduce scope.
- Apply Google Search’s PageRank algorithm.
- Move in waves.
- Phase out Hadoop.
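The talk summary does not say how PageRank was applied during the migration. One plausible reading is that it was used to score nodes in the dependency graph so that the most depended-upon assets could be prioritized. The following is a minimal sketch under that assumption; the table names, the edge direction (consumer points to the table it reads), and the damping factor are all hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical dependency edges: each consumer points to the table it reads from.
edges = [
    ("report_daily", "bookings_clean"),
    ("ml_features", "bookings_clean"),
    ("ml_features", "reviews_clean"),
    ("bookings_clean", "bookings_raw"),
    ("reviews_clean", "reviews_raw"),
]
scores = pagerank(edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph, `bookings_raw` scores highest because every downstream consumer ultimately depends on it. A real migration would presumably combine such a score with usage data from the earlier phases.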
Manuel said the key to success lies in a unified command center.
Machine Learning Engineering
The evolution of Booking.com’s machine learning stack started with Perl libraries and MySQL in 2005 and culminated in agentic systems in 2025. In between came Apache Oozie with Python, Apache Spark with MLlib, H2O.ai, deep learning, and GenAI.
Manuel claimed that 2015 was a pivotal year for Booking.com because it solved two core problems: real-time prediction using large-scale online inference, and feature engineering for training and inference.

As of 2024, the company’s machine learning inference platform serves more than 480 machine learning models and handles 400 billion predictions per day with less than 20 milliseconds of latency.
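To put the daily figure in perspective, it can be converted to an average per-second rate. This is a back-of-the-envelope calculation only: it assumes load is spread evenly across the day, which real traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figure quoted in the talk summary.
predictions_per_day = 400_000_000_000

# Average rate, assuming (unrealistically) uniform load over 24 hours.
average_per_second = predictions_per_day / SECONDS_PER_DAY
print(f"{average_per_second:,.0f} predictions per second on average")
```

That works out to roughly 4.6 million predictions per second on average.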
Domain Intelligence
Manuel discussed four domain-specific machine learning platforms and their respective use cases. The first three are:
- GenAI, with use cases that include trip planning, smart filters, and review summaries.
- Content Intelligence, a machine learning content hub for image and review analysis and text generation, with use cases such as detailed hotel content.
- Recommendations, with use cases such as displaying personalized content to customers.
Ranking, the fourth domain-specific machine learning platform, handles personalized real-time ordering and was a more complex task. Booking.com’s three-way optimization challenge involved balancing choice and value, exposure and growth, and efficiency and profitability.
Their 2005 ranking formula was a simple function of parameters such as reservations and views, plus a random number function. They tried to evolve the formula by considering factors such as cancellations, distance-based rankings, availability, and hotel impressions. When they tried to replace the ranking formula with machine learning, they found that, due to infrastructure limitations, the formula was, to borrow Manuel’s characterization, “invincible.”
Their experiments typically lasted two to four weeks, and the team looked for ways to speed them up. They adapted their A/B testing to incorporate an interleaving technique, essentially interleaving 50% of each set of experiments into a single experiment. This allowed them to test more variants with less traffic. They therefore decided to preselect candidates with interleaving and validate the winners with A/B testing.
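The summary does not specify which interleaving method Booking.com uses. As an illustration of the general idea, the sketch below implements team-draft interleaving, one well-known variant: two rankers take turns contributing their best not-yet-shown item to a single list, and clicks are credited to whichever ranker contributed the clicked item. The hotel IDs and function names are hypothetical.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Merge two rankings into one list, remembering which ranker supplied each item."""
    rng = rng or random.Random()
    sources = {"A": ranking_a, "B": ranking_b}
    count = {"A": 0, "B": 0}
    interleaved, team_of = [], {}
    while len(interleaved) < length:
        # The team with fewer picks goes first; ties are broken by a coin flip.
        if count["A"] != count["B"]:
            order = ["A", "B"] if count["A"] < count["B"] else ["B", "A"]
        else:
            order = rng.sample(["A", "B"], 2)
        for team in order:
            # Highest-ranked item from this team that is not already in the list.
            pick = next((item for item in sources[team] if item not in team_of), None)
            if pick is not None:
                interleaved.append(pick)
                team_of[pick] = team
                count[team] += 1
                break
        else:
            break  # both rankings are exhausted
    return interleaved, team_of


def credit_clicks(clicked_items, team_of):
    """Count clicks per ranker; the ranker with more credited clicks is preferred."""
    wins = {"A": 0, "B": 0}
    for item in clicked_items:
        if item in team_of:
            wins[team_of[item]] += 1
    return wins


# Hypothetical hotel IDs as ordered by two candidate rankers.
ranking_a = ["h1", "h2", "h3", "h4"]
ranking_b = ["h3", "h1", "h5", "h6"]
shown, team_of = team_draft_interleave(ranking_a, ranking_b, length=4, rng=random.Random(7))
wins = credit_clicks([shown[0]], team_of)
print(shown, wins)
```

Because every user sees results from both rankers in one list, each session yields a direct preference signal. This is a common explanation for why interleaving needs less traffic than a split A/B test to compare rankers.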

Manuel concluded his presentation by describing how the domain-specific platforms integrate with the orchestration layer.
