Edmunds Prepares for AI by Unifying Data Infrastructure – CIO


Edmunds, an online resource for automotive inventory and information, has spent a decade working to consolidate its data infrastructure. Now, with the infrastructure side of the data house in place, the California-based company envisions a bold new future with AI and machine learning (ML) at its core.

“Most of the integration challenges have been solved,” said Greg Rokita, vice president of technology at Edmunds. “So how do we stay ahead in this AI environment? What foundational framework should we develop to make our product teams more productive and outperform the competition?”

Rokita has been with Edmunds for more than 18 years, having joined as executive director of technology in 2005. His role currently includes responsibility for the data engineering, analytics development, vehicle inventory, and statistics and pricing teams.

The company started in 1966 as a series of printed buying guides and began offering data via CD-ROM in the 1990s. The move online soon followed. Rokita took part in the launch of the company's first free online magazine, and a few years later his team launched the company's first mobile phone app.

The Edmunds website currently provides new and used vehicle pricing data, dealer and inventory listings, a database of national and local incentives and rebates, vehicle reviews, and advice on buying and owning a vehicle. CarMax acquired the company for $404 million in 2021.

One of the ways Rokita is trying to stay ahead in the AI space is by creating a new ChatGPT plugin that exposes Edmunds' unstructured data (vehicle reviews, ratings, editorials) to generative AI.

OpenAI, the developer of ChatGPT, trained its generative AI on Common Crawl, a corpus of billions of public web pages. But in a world that moves at the speed of the internet, data goes stale very quickly. The idea behind Edmunds' new plugin is to give ChatGPT the ability to draw on a large collection of expert, constantly updated data.

“If you ask, ‘How does the 2022 Toyota Camry run?’ you will get nothing,” says Rokita. “By developing a plugin, you are publishing the latest data.”

Edmunds hopes that generative AI users who need vehicle details and photos will click through to its site, driving more traffic.

Rokita strongly believes that we are at a new tipping point, much like the internet revolution of the 2000s that transformed nearly every industry.

“Twenty or thirty years ago, the internet made its way into every business,” says Rokita. “We believe the same thing is happening with AI right now. Soon, AI will be built into the business to optimize how things are done.”

Unless AI becomes part of the company's fabric, he argues, Edmunds will fall behind.

“One of the challenges for my team is creating a framework and getting the company moving on that path,” he says.

Rokita believes the key to that transition is to stop thinking of data warehousing and AI/ML as separate departments with different systems.

“People need to understand that these are actually different manifestations of the same system,” Rokita said. “Data warehouses are about past data, models are about future data. Imagine a table where past behavior and predicted future behavior all become one timeline.”
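To make the "one timeline" idea concrete, here is a minimal Python sketch. It is an illustration only, not Edmunds' implementation: the `Observation` record, the `page_views` metric, and the `naive_forecast` stand-in model are all hypothetical. The point is simply that recorded history and model output can share one schema, distinguished by a flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Observation:
    day: date
    page_views: int    # hypothetical metric, not taken from the article
    is_forecast: bool  # False = recorded history, True = model output


def naive_forecast(history, horizon):
    """Stand-in 'model': repeat the mean of the last three recorded days."""
    recent = history[-3:]
    mean = round(sum(o.page_views for o in recent) / len(recent))
    last_day = history[-1].day
    return [
        Observation(last_day + timedelta(days=i), mean, True)
        for i in range(1, horizon + 1)
    ]


def unified_timeline(history, horizon):
    """Return history followed by forecasts as one date-ordered table."""
    ordered = sorted(history, key=lambda o: o.day)
    return ordered + naive_forecast(ordered, horizon)


history = [
    Observation(date(2023, 6, 1), 100, False),
    Observation(date(2023, 6, 2), 120, False),
    Observation(date(2023, 6, 3), 110, False),
]
timeline = unified_timeline(history, horizon=2)
for row in timeline:
    print(row.day, row.page_views, "forecast" if row.is_forecast else "actual")
```

A consumer of `timeline` can query past and predicted rows the same way and filter on `is_forecast` only when the distinction matters.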

This thinking drove Rokita's determination to consolidate the Edmunds data infrastructure. Like many companies that were early to recognize the benefits of new data technologies, Edmunds had seen its data infrastructure grow into a suite of best-of-breed point solutions.

“We started with a purpose-built data warehouse built on Oracle racks and have progressed through purpose-built systems like Netezza and Teradata,” he says. “Previously, we would process data in Hadoop and load it into Netezza for people to query.”

About ten years ago, Rokita decided to start consolidating the infrastructure. The first step was moving to the cloud. The team replaced Netezza with Amazon Redshift, then added the Databricks cloud platform for data science and AI. However, the consolidation was still far from complete. With different systems for data science, data warehousing, and data processing, the team still had to worry about data getting out of sync.

“When you work with analysts to look at data in two different places and the data doesn’t match, you lose credibility,” Rokita says. “It’s important that people in your organization have a consistent view of the data.”

As Databricks added new data warehousing capabilities to its platform, Rokita decided to move away from Redshift and Hadoop and instead do everything with Databricks as a layer on top of AWS. Rokita says the change not only reduced costs but also made operations easier to manage.

“We now have one system that handles both data processing and serving, with the added benefit of being able to build models on top of data without duplicating it,” he says.

Rokita and his team are now using one of Databricks' newest features: the Databricks Marketplace, a marketplace for data, AI models, and applications. As part of the service, Databricks curates and publishes open source models for common use cases such as instruction following and text summarization. Third-party data providers such as S&P Global, Experian, AccuWeather, and LexisNexis also participate in the marketplace.

Rokita believes the ability to join third-party data to Edmunds data with the click of a button, without the need for development time, will open up new horizons for the company and its use of analytics and ML.

“For example, you can search for what you want, like demographic data of potential car buyers, and use that in your advertising campaigns,” he says. “Just click the box and this dataset will appear in Databricks.”
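The kind of enrichment Rokita describes, attaching third-party demographics to first-party records, can be sketched as a simple left join. This is a plain-Python illustration under stated assumptions, not the Databricks Marketplace API: the `leads` and `demographics` rows, their field names, and the `left_join` helper are all hypothetical.

```python
# Hypothetical first-party rows: shopper leads keyed by a region code.
leads = [
    {"lead_id": 1, "region": "CA-LA", "model": "Camry"},
    {"lead_id": 2, "region": "TX-AUS", "model": "F-150"},
    {"lead_id": 3, "region": "NY-NYC", "model": "Civic"},
]

# Hypothetical third-party rows: demographics for some of those regions.
demographics = [
    {"region": "CA-LA", "median_income": 70000},
    {"region": "TX-AUS", "median_income": 80000},
]


def left_join(left, right, key):
    """Attach matching right-hand columns to every left-hand row.

    Left rows with no match are kept unchanged, so no lead is dropped.
    """
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key], {})
        joined.append({**match, **row})  # left values win on any conflict
    return joined


enriched = left_join(leads, demographics, key="region")
for row in enriched:
    print(row)
```

A left join is used here so that leads in regions the provider does not cover still appear in the result, just without the extra columns.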

Rokita also points out that Edmunds' parent company, CarMax, runs its own instance of Databricks on Microsoft Azure, while Edmunds' instance runs on AWS. With Marketplace, the two can share data without consolidating that infrastructure.

“Often we want to share data with each other,” he says. “Now, with no development costs, we can share datasets with them and they can share datasets with us. That is something I am also really excited about.”


