How Databricks is adding generative AI capabilities to its Delta Lake lakehouse




It’s been a busy few weeks for Databricks. After releasing a new iteration of its data lakehouse with a universal table format and introducing Lakehouse Apps, the company on Wednesday announced new tools aimed at helping data professionals develop generative AI capabilities.

The new capabilities being added to the company’s Delta Lake lakehouse include a proprietary enterprise knowledge engine called LakehouseIQ, new vector search capabilities, a low-code large language model (LLM) tuning tool called AutoML, and open source foundation models.

The new capabilities draw on technology from the company’s recent acquisitions: MosaicML this week and Okera in May.

LakehouseIQ enables enterprise search with NLP

The new LakehouseIQ engine aims to let enterprise users search for data and insights in Delta Lake without seeking technical assistance from data experts. To simplify data retrieval for non-technical users, the LakehouseIQ engine uses natural language processing (NLP).

To enable NLP-based enterprise search, LakehouseIQ uses generative AI to understand an organization's terminology, data usage patterns, and concepts such as organizational structure.

This differs from the common technique of building knowledge graphs, which is used by companies like Glean and Salesforce. A knowledge graph is a representation of structured and unstructured data in the form of nodes and edges. Nodes represent entities (people, places, concepts, etc.) and edges represent relationships between these entities.
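To make the node-and-edge idea concrete, here is a minimal sketch of a knowledge graph in Python. It is a toy illustration only: the class, the entity names, and the relationship labels are all hypothetical, and it does not reflect how Glean, Salesforce, or Databricks implement anything.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy knowledge graph: nodes are entities, edges are labeled relationships."""

    def __init__(self):
        self.nodes = {}                 # entity name -> entity type
        self.edges = defaultdict(list)  # source name -> [(relation, target name)]

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def add_edge(self, source, relation, target):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[source].append((relation, target))

    def neighbors(self, source, relation=None):
        # Return targets reachable from `source`, optionally limited to one relation.
        return [
            target
            for rel, target in self.edges.get(source, [])
            if relation is None or rel == relation
        ]


# Hypothetical entities and relationships, invented for illustration.
graph = KnowledgeGraph()
graph.add_node("Alice", "person")
graph.add_node("Sales", "team")
graph.add_node("Q2 revenue report", "document")
graph.add_edge("Alice", "member_of", "Sales")
graph.add_edge("Alice", "authored", "Q2 revenue report")

print(graph.neighbors("Alice"))
print(graph.neighbors("Alice", relation="authored"))
```

A search built on such a graph answers questions by walking explicit edges, so the relationships have to be modeled up front.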

In contrast, according to Sanjeev Mohan, principal analyst at SanjMo, the LakehouseIQ engine consists of machine learning models that infer the context of data sources and make it available for search with natural language queries.

Enterprise users will be able to access LakehouseIQ’s search capabilities through the Assistant in Notebooks and the SQL Editor, the company said. The Assistant can perform a variety of tasks, such as building queries and answering data-related questions.

Databricks said it is adding LakehouseIQ to many management functions within its lakehouse to provide automated suggestions. These may include alerts about incomplete data sets and suggestions for debugging jobs and SQL queries.

Additionally, the company is exposing LakehouseIQ’s APIs so that custom applications that companies develop can leverage its capabilities, said Joel Minnick, vice president of marketing at Databricks.

The Assistant, powered by LakehouseIQ, is currently in preview.

Delta Lake gets AI toolbox for developing generative AI use cases

Databricks said the addition of the Lakehouse AI toolbox to its lakehouse is intended to support the development of enterprise generative AI applications, such as intelligent assistants. The toolbox consists of features such as vector search, low-code AutoML, a collection of open source models, MLflow 2.5, and Lakehouse Monitoring.

“With embeddings of files automatically created and managed in Unity Catalog, plus the ability to add query filters for searches, vector search helps developers improve the accuracy of generative AI responses,” Minnick said, adding that the embeddings will be kept up to date using Databricks’ model serving.
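The general idea of vector search combined with query filters can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Databricks API: the function names, the in-memory index, the three-number "embeddings," and the `dept` metadata field are all hypothetical, and real embeddings would come from a model instead of being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(index, query_vector, filters=None, top_k=3):
    """Return the ids of the top_k most similar documents that match the filters."""
    filters = filters or {}
    # Step 1: keep only documents whose metadata matches every filter.
    candidates = [
        doc for doc in index
        if all(doc["metadata"].get(key) == value for key, value in filters.items())
    ]
    # Step 2: rank the remaining documents by similarity to the query embedding.
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(doc["embedding"], query_vector),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:top_k]]


# Hypothetical index with hand-written embeddings, invented for illustration.
INDEX = [
    {"id": "hr-policy", "embedding": [0.9, 0.1, 0.0], "metadata": {"dept": "hr"}},
    {"id": "sales-deck", "embedding": [0.1, 0.9, 0.0], "metadata": {"dept": "sales"}},
    {"id": "sales-faq", "embedding": [0.2, 0.8, 0.1], "metadata": {"dept": "sales"}},
]

query = [0.85, 0.15, 0.0]
print(vector_search(INDEX, query))
print(vector_search(INDEX, query, filters={"dept": "sales"}))
```

Without a filter, the closest document overall ranks first. With the filter, only documents tagged for the sales department are considered, which is one way filtering can keep irrelevant material out of the context passed to a generative model.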




