Lakehouse company Databricks has announced that it has entered into a definitive agreement to acquire Okera, an AI-centric data governance platform.
Okera aims to solve data privacy and governance challenges across the spectrum of data and AI. It simplifies data visibility and transparency and helps organizations make sense of their data, which is essential in the age of large language models (LLMs) and for addressing concerns about bias.
Databricks co-founder and CEO Ali Ghodsi said the company has been dedicated to democratizing data and AI for organizations around the world for 10 years, and that he has been asked how organizations can leverage data in ways that power AI and LLMs in their business. He pointed to an integrated, AI-centric governance experience powered by Unity Catalog.
Okera co-founder and CEO Nong Li is widely known for creating Apache Parquet, the open source standard storage format that Databricks and much of the industry build on.
Li also previously played a key role at Databricks, where he led the vectorized Parquet and code generation efforts that delivered a 10x performance improvement in Apache Spark 2.0.
Li said that as the amount, velocity, and variety of data across applications continue to grow, CIOs, CDOs, and CEOs must balance two often-conflicting initiatives, and that managing access policies has been painful and time-consuming.
He continued, “Many organizations do not have enough technical talent to manage access policies at scale, especially with the proliferation of LLMs. What they need is a modern, AI-centric governance solution.
“We are very excited to join the Databricks team and bring our expertise in building secure, scalable and simple governance solutions for the world’s most forward-thinking enterprises.”
According to Databricks, Okera’s data governance platform offers two unique technologies that can meet the challenges of data governance in this new world.
First, Okera provides an AI-powered interface that automatically discovers, classifies, and tags sensitive data such as personally identifiable information (PII). These tags enable data governance stakeholders to easily assess compliance and create no-code access policies that improve data visibility and control.
Okera also provides a self-service portal for quickly auditing and analyzing sensitive data usage, so organizations can monitor and track data usage patterns. This helps ensure consistent enforcement of governance policies even as the number of data assets, many of them AI-generated, grows rapidly.
Second, Okera is developing a new isolation technology that can support any workload while enforcing governance controls without sacrificing performance.
This technology is in private preview and has been tested by many joint customers, especially for AI workloads. According to Databricks, it is key to enabling companies to efficiently support the full range of applications in this new world.
According to the company, the lakehouse is the ideal place to develop data and AI applications together and to build LLMs. Databricks' lakehouse vision centers on consolidating these workloads onto a single platform. At the foundation of that vision is Unity Catalog, the data governance layer for all data and AI workloads.
Databricks says customers will benefit from being able to use AI to discover, classify, and manage all data, analytics, and AI assets (including ML models and model functions) with attribute-based and intent-based access policies.
Customers will also benefit from end-to-end data observability across the lakehouse. This allows them to centrally audit and report on sensitive data usage across analytics and AI applications and to automatically track data lineage down to the column level.
Mr Ghodsi said that with these enhancements, customers will have a complete view of their data assets across clouds, define access policies using a single permissions model, accelerate their use of AI, and gain greater visibility and consistent governance across the lakehouse.
He added that the acquisition will also allow Databricks to expose a richer set of policy APIs that other data governance partners can use, enabling a seamless solution for customers.
