May 9, 2024 — Databricks recently announced the general availability of Vector Search and a major update to Model Serving that make it easy for enterprises to build high-quality RAG applications with native functionality available directly in the Databricks Data Intelligence Platform. In this blog post, Akhil Gupta and Oliver Chiu discuss the practical implications of these updates and how they streamline the deployment of AI applications in the enterprise.
In December, we announced a new suite of tools for bringing generative AI applications into production using Retrieval Augmented Generation (RAG). Since then, thousands of customers have built RAG applications on the Databricks Data Intelligence Platform at an explosive pace.
Today, we are excited to make several announcements that make it easier for enterprises to build high-quality RAG applications with native functionality available directly in the Databricks Data Intelligence Platform, including the general availability of Vector Search and major updates to Model Serving.
The challenge of high-quality AI applications
As we work closely with customers to build and deploy AI applications, we have found that the biggest challenge is achieving the high standard of quality required for customer-facing systems. Developers spend significant time and effort ensuring that the output of AI applications is accurate, safe, and governed before making it available to customers, and they often cite accuracy and quality as the biggest obstacles to unlocking the value of these exciting new technologies.
Traditionally, the main lever for maximizing quality has been deploying the LLM with the highest-quality baseline reasoning and knowledge capabilities. However, recent research shows that base model quality is only one of many factors that determine the quality of an AI application. An LLM without enterprise context and guidance will still hallucinate, because by default it does not understand your data. AI applications can also expose sensitive or inappropriate data if governance and access controls are not in place.
An AI systems approach to quality
Achieving production quality with GenAI applications requires a comprehensive approach covering all aspects of the GenAI process: data preparation, retrieval models, language models (SaaS or open source), ranking and post-processing pipelines, prompt engineering, and training on custom enterprise data. Together, these components make up an AI system.
Today, we're excited to announce major updates to help customers build production-quality GenAI applications:
- General availability of Vector Search, a serverless vector database purpose-built for customers to augment LLMs with enterprise data.
- Foundation Model APIs in Model Serving, scheduled for general availability in the coming weeks, which allow you to access and query state-of-the-art LLMs from a serving endpoint.
- Major updates to Model Serving:
  - A new user interface that makes it easier than ever to deploy, serve, monitor, manage, and query LLMs
  - Support for additional state-of-the-art models: Claude 3, Gemini, DBRX, and Llama 3
  - Better performance for deploying and querying LLMs at scale
  - Support for inference tables across all types of serving endpoints, improving governance and auditability
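To make the RAG pattern behind these features concrete, here is a minimal, self-contained sketch of the step that sits between retrieval and generation: packing retrieved chunks into a grounded prompt. The helper name, prompt wording, and character budget are illustrative assumptions, not part of any Databricks API; the comments show roughly where a real application would call a Vector Search index and a Model Serving endpoint.

```python
# Minimal RAG prompt-assembly sketch (illustrative names, not a Databricks API).
# In a real app, `docs` would come from a Vector Search similarity query, and
# the returned prompt would be sent to a Model Serving endpoint.

def build_rag_prompt(question: str, docs: list[str], max_chars: int = 2000) -> str:
    """Pack retrieved chunks into a grounded prompt, trimming to a size budget."""
    context, used = [], 0
    for doc in docs:
        if used + len(doc) > max_chars:  # stop once the context budget is spent
            break
        context.append(doc)
        used += len(doc)
    context_block = "\n---\n".join(context)
    return (
        "Answer using only the context below. If the answer is not in the "
        "context, say you don't know.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {question}"
    )

# Surrounding steps would look roughly like (pseudocode, names assumed):
#   results = index.similarity_search(query_text=question, num_results=5)
#   prompt  = build_rag_prompt(question, retrieved_chunks)
#   answer  = client.predict(endpoint="my-rag-endpoint", inputs={"prompt": prompt})

prompt = build_rag_prompt(
    "What is Vector Search?",
    ["Vector Search is a serverless vector database built into Databricks."],
)
```

The budget-based trimming matters in practice: retrieved chunks that overflow the model's context window would otherwise be silently truncated by the endpoint.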
We have also previously announced several features to help you deploy production-quality GenAI.
Later this week, we'll post a detailed blog about how to use these new features to build high-quality RAG apps, as well as an inside look at how we built DBRX, an open, general-purpose LLM created by Databricks.
Source: Akhil Gupta and Oliver Chiu, Databricks