Just as Databricks developed an open standard in 2021 to let enterprises securely share data with internal and external partners, the vendor has now introduced OpenSharing to let organizations share their AI assets.
Five years ago, Databricks built Delta Sharing, a subproject within the open source Delta Lake project, to enable enterprises to share and collaborate on data without moving or duplicating it.
OpenSharing, released on June 10 and available on GitHub, is hosted by the Linux Foundation and is an extension of Delta Sharing. The new open source standard enables organizations to collaborate further across platforms, departments and partners by sharing AI models, agent skills that capture expertise and workflows, and unstructured data.
Additionally, OpenSharing expands collaboration by adding support for platforms that connect to the Apache Iceberg REST catalog, which enables new cross-organizational sharing. It also adds partnerships with on-premises storage vendors so that on-premises data and AI assets can be shared in place, without being moved.
“OpenSharing represents a shift from simple data exchange to a unified, managed interface for AI and data stacks,” William McKnight, president of McKnight Consulting, told TechTarget. “Beyond traditional tables… this framework provides a blueprint for studying and extending how autonomous agents interact with distributed data, which could be extremely important for data sharing.”
Stephen Catanzano, an analyst at Omdia, a division of Informa TechTarget, similarly called OpenSharing an important development.
“OpenSharing is a solid development in the AI infrastructure space,” he told TechTarget. “This is especially important because it extends secure zero-copy sharing beyond structured data to include agent skills and AI models, assets that are becoming critical in the agent era. Until now, organizations have not had a standardized way to share these AI components across platforms.”
San Francisco-based Databricks pioneered the data lakehouse architecture for storing data. Like many data management vendors, Databricks has offered machine learning capabilities since its founding in 2013, but in recent years it has focused much of its product development on enabling customers to build generative and agentic AI capabilities.
AI sharing
Collaboration within a single department, across organizational domains, or with third-party partners is an important way to accelerate innovation and improve productivity.
Historically, collaboration has been driven by data. As a result, many data management and analytics vendors have built collaboration capabilities into their platforms, allowing remote employees to collaborate in virtual hubs, especially during and in the immediate aftermath of the COVID-19 pandemic.
Today, more companies are incorporating AI into collaboration, building agents and other AI tools to assist employees and perform specific business processes. Collaborative projects are built not only on data and data products such as reports and dashboards, but also on agents and other AI assets.
But without a standard way to share AI assets, organizations must cobble together their own methods and repeat the process each time a department or company tries to collaborate with one that doesn’t use the exact same tools.
According to Akram Chetibi, Director of Product Management at Databricks, OpenSharing was developed to provide a standard and repeatable way to share AI and enable collaboration.
“AI has created a problem that no one has really solved yet,” he told TechTarget. “When you look at how organizations are starting to share across company boundaries, not just data tables, but AI models, agent skills, prompts, tools, etc., there was no standard way to do it. Everyone was combining their own approaches.”
Chetibi added that OpenSharing is a logical extension of Databricks’ experience developing Delta Sharing.
“We had already seen this problem occur with structured data, even before Delta Sharing existed, and we didn’t want it to repeat itself in the AI world,” he said. “OpenSharing solves this problem by providing organizations with a single open protocol to publish and consume AI assets, regardless of the platform on which either side is running.”
Specific benefits of OpenSharing include:
An open protocol for publishing and sharing data and AI assets, which removes the need to manually copy files for each collaboration effort.
APIs for discovery, authorization and access that work regardless of the platform different departments and organizations use to manage their data and AI tools.
Support for the Apache Iceberg REST catalog API, which extends the reach of Delta Sharing to organizations using Iceberg-native tools.
Integration with on-premises platforms and private clouds, which gives enterprises that choose to keep their AI operations in an environment they consider more secure than the public cloud the same collaboration capabilities as those using the public cloud.
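To make the discovery, authorization and access steps concrete, the following is a minimal in-memory sketch of how a sharing server might behave. It is not the OpenSharing API: the `Asset` and `ShareServer` classes, their method names, the recipient name and the `s3://example-bucket` locations are all hypothetical and exist only for illustration. The point it shows is that a recipient receives a reference to where an asset already lives, not a copy of it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """Hypothetical description of a shared asset (not an OpenSharing type)."""
    name: str
    kind: str       # e.g. "table", "model" or "agent_skill"
    location: str   # where the asset already lives; it is never copied


@dataclass
class ShareServer:
    """Toy sharing server: publish, grant, discover, access."""
    assets: dict = field(default_factory=dict)   # asset name -> Asset
    grants: dict = field(default_factory=dict)   # recipient -> set of asset names

    def publish(self, asset: Asset) -> None:
        # The provider registers an asset once, by reference.
        self.assets[asset.name] = asset

    def grant(self, recipient: str, asset_name: str) -> None:
        # Authorization: record that a recipient may see a published asset.
        if asset_name not in self.assets:
            raise KeyError(f"unknown asset: {asset_name}")
        self.grants.setdefault(recipient, set()).add(asset_name)

    def discover(self, recipient: str) -> list:
        # Discovery: a recipient sees only the assets granted to it.
        return sorted(self.grants.get(recipient, set()))

    def access(self, recipient: str, asset_name: str) -> Asset:
        # Access: return a reference to the asset, or refuse if not granted.
        if asset_name not in self.grants.get(recipient, set()):
            raise PermissionError(f"{recipient} may not access {asset_name}")
        return self.assets[asset_name]


# Hypothetical usage: one provider, one partner, two assets.
server = ShareServer()
server.publish(Asset("churn_model", "model", "s3://example-bucket/models/churn"))
server.publish(Asset("sales_table", "table", "s3://example-bucket/tables/sales"))
server.grant("partner_a", "churn_model")

visible = server.discover("partner_a")           # only the granted asset
ref = server.access("partner_a", "churn_model")  # a reference, not a copy
print(visible, ref.location)
```

In this sketch the partner can see and open only the model it was granted, and a request for the ungranted table raises `PermissionError`. A real protocol would add authentication, short-lived credentials and network transport, none of which are modeled here.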
Catanzano noted that as agent-based AI systems become more prevalent, sharing requirements for collaboration are expanding beyond structured datasets to include new, more complex assets. OpenSharing addresses some of the resulting challenges, including multi-cloud and hybrid infrastructures across which AI assets are distributed.
“OpenSharing addresses this new complexity by providing a unified protocol that works across these fragmented environments, enabling the kind of seamless AI collaboration that the current landscape demands but was previously unable to support,” said Catanzano.
Additionally, McKnight said a standardized way to share data and AI assets is beneficial, given that agents themselves are becoming some of the primary consumers of data, executors of workloads, and collaborators across domains.
“These systems read the data and the underlying metadata, such as model weights and execution code. OpenSharing treats this like a single managed package,” he said. “Furthermore, with modern enterprises operating at high speeds and resulting in many lakehouses and open data formats, organizations need open, vendor-neutral standards to share assets without the costs and delays associated with copying data.”
Standardization of documents
In other open source news, the LF AI & Data Foundation announced the creation of the DocLang Specification Working Group. The group was founded to develop DocLang, an open, AI-native document format that standardizes the preparation, exchange and management of document data for AI systems.
Catanzano pointed out that most of a company’s knowledge is stored in documents such as PDFs, Word files, and presentation slides. Extracting and operationalizing such document data for AI can be difficult and time-consuming, slowing down AI efforts. The DocLang working group is therefore addressing a critical bottleneck in AI adoption, Catanzano said.
“Combining the standard DocLang with the open source processing toolkit Docling creates a complete stack of document AI that has the potential to accelerate enterprise AI adoption by making document understanding more deterministic and interoperable across systems,” he said. “This is especially timely as agent AI systems are increasingly required to process unstructured corporate documents at scale.”
Meanwhile, McKnight said the creation of the working group signals the growing importance of AI-native document data.
“This launch is where the industry moves from fragmentation to collaboration to defragment,” he said. “This is the beginning of a foundational transition to AI-native documents and interactions.”
While OpenSharing and DocLang are each valuable on their own, they are also part of a continuing trend toward open standards for developing, deploying and managing AI.
The Model Context Protocol, an open standard released by Anthropic in November 2024, standardizes how AI models connect to an organization’s proprietary data sources and has been widely adopted. The Agent2Agent protocol, introduced by Google Cloud in April 2025, provides a standard for agent interactions.
According to Chetibi, using open source capabilities to build and manage agents and other AI tools is valuable because it allows companies to remain flexible. That’s why Databricks chose to open source OpenSharing rather than keep it a proprietary feature within the broader Databricks platform.
“Customers and partners don’t want to work with their data and AI assets while being confined to a single vendor’s proprietary ecosystem,” said Chetibi. “The reason it doesn’t stick is because it constrains innovation.”
Eric Avidon is a senior news writer at Informa TechTarget and a journalist with more than 30 years of experience. He covers analytics and data management.