Open AI practices involve openly sharing AI models, the provenance of training data, and the underlying code; in closed AI, one or more of these are kept hidden or protected.
There are practical reasons why companies might adopt one approach or the other. Closed AI tends to move faster and is available through a variety of cloud services. Open AI is slower to evolve, but its underlying code, models, and data are subject to greater scrutiny, often leading to better explainability and security. Additionally, open data sources can help protect companies from intellectual property and copyright infringement claims in an evolving legal landscape.
While both approaches have their merits, the decision to use open or closed AI is hotly debated, and major players tend to come down strongly on one side or the other. For example, regulators concerned about empowering geopolitical adversaries tend to oppose open AI, as do vendors looking to maintain a competitive advantage. But for researchers looking to build on the latest innovations in the field and make further discoveries, open AI is essential. Still, some companies take a hybrid approach, offering some parts of their AI technology openly while keeping others proprietary.
Open AI's role in generative AI
The development of generative AI shows how complex the decision between open and closed AI can be. The generative AI applications that have taken the world by storm were built on the open approach Google took in developing the Transformer, the deep learning architecture that revolutionized natural language processing.
Meanwhile, OpenAI, a research institute and commercial company, improved on Google's algorithms and incorporated them into its popular ChatGPT offering, which it runs as a closed service. Google then scaled back its own open AI efforts in order to capture more of the benefit from its advances.
Meta has gone in the opposite direction, sharing its Large Language Model Meta AI (Llama) under a semi-open license. Meanwhile, US regulators concerned about China's dominance in AI have called for public hearings on the use of open AI, and security researchers worry that hackers and spammers will use open AI innovations to harm society. At the same time, researchers are running open AI models directly on phones to build on the latest generative AI innovations.
The future role of open and closed AI in innovation and transformation
How will the push and pull between open AI and closed systems play out? Most experts expect the state of the art in AI will continue to be driven by open AI principles.
“AI is unlikely to innovate on its own in a closed scenario,” said Brian Steele, vice president of product management at Gryphon.ai.
But it's unclear how widespread open AI practices will become: GitLab Chief Product Officer David DeSanto said few AI vendors are willing to release information about where their training data comes from, how their enterprise customers' data is used in these models, or whether the data is retained according to customers' wishes.
“Many companies don't want their intellectual property to be used to train or improve models, or for any purpose other than providing output from the models,” DeSanto said. Additionally, security experts are concerned that AI services could experience data breaches, putting sensitive intellectual property data at risk, he noted.
Characteristics of closed AI and open AI
In traditional software development, the distinction between open and closed source has focused on licensing restrictions on code reuse. Srinivas Atreya, principal data scientist at Cigniti Technologies, an IT services and consulting company, explained that the line between open and closed AI is more of a spectrum than a binary division.
“Many organizations are adopting a mixed approach, sharing some elements openly while keeping others private,” Atreya says. For example, a company might open-source its model code but keep the training data and details of the model architecture private. He believes we'll increasingly see researchers sharing their AI breakthroughs through collaborative disclosure or phased releases, making models and related resources public in a controlled manner that accounts for safety and societal impacts.
When thinking about the difference between closed and open AI, Atreya believes it’s useful to consider three key aspects:
- Model. Closed AI models are kept private by the organizations that develop them. These organizations typically keep details about the model's structure, training process, and proprietary information secret to protect their investment and stay competitive. In contrast, open AI models are published, meaning their architecture, parameters, and often the trained model itself are shared openly so that other researchers and developers can use, learn from, and build on them.
- Data. AI models are trained on vast amounts of data. In closed AI, details of the training data (such as where it comes from, how it's collected and cleaned, and its composition) are often kept secret, whether for privacy, legal, or competitive reasons. In open AI, there is typically more transparency about the data used for training. Open data can be inspected and scrutinized by others to verify its quality, bias, and appropriateness for the model's intended use.
- Code. Code refers to the software used to train and run an AI model. In a closed AI approach, the code is kept private. In an open AI approach, the code is made publicly available, often as open source software, so that others can inspect it to understand how the AI is trained and operates, and can modify and improve it, as illustrated in the sketch after this list.
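To make the model and code dimensions concrete, here is a minimal sketch of what working with an open model looks like in practice: the architecture, weights, and inference code can all be pulled and inspected locally, which an API-only closed service does not allow. It assumes the Hugging Face transformers library and uses GPT-2 only because it is a small, fully open checkpoint; neither choice comes from the experts quoted here.

```python
# A minimal sketch of the open approach: architecture, weights, and
# inference code are all published and locally inspectable.
# Assumes the Hugging Face transformers library; GPT-2 is used purely
# as a small, openly licensed example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # any openly published checkpoint would work
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the model is open, its internals are directly visible;
# a closed, API-only service exposes none of this.
print(model.config)  # full architecture details: layers, heads, dimensions
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")

# The weights can also be run, modified, and fine-tuned locally,
# subject to the model's license.
inputs = tokenizer("Open models can be", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With a closed model, by contrast, only a hosted API endpoint is exposed; the architecture, weights, and training pipeline behind it remain hidden.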
Advantages of closed AI
According to AI experts interviewed for this article, a closed AI approach offers the following advantages:
- Faster development cycles. Closed AI can support faster cycles for improving the security and performance of models, according to Tiago Cardoso, group product manager at Hyland.
- Ease of use. Closed AI vendors often offer infrastructure and support services to speed up the adoption of enterprise apps that connect to their models.
- Potential licensing flexibility. According to Gryphon.ai's Steele, closed AI systems avoid the legal issues and reuse restrictions that surround open source systems.
- Commercial interests. Nick Amabile, CEO and chief consulting officer at digital transformation consulting firm DAS42, says a closed model gives companies an advantage in commercializing innovation: It encourages continuous and rapid development of new applications and features by commercial software vendors.
- Enhanced control. Sharing internal libraries can slow innovation: “When you manage your users and all the different systems that depend on your code, it's much easier to evolve those systems,” says Jonathan Watson, CTO at Clio.
The benefits of open AI
According to AI experts interviewed for this article, open AI offers the industry the following benefits:
- Increased scrutiny. Open AI allows the larger community to identify and mitigate problems. “Outside knowledge is often better than inside knowledge, and the community can help advance the mission, find and solve problems, and create impact in the tech ecosystem,” Watson says.
- Better recruitment. Sharing AI innovations openly attracts prominent developers who want to innovate on the cutting edge. “The open source model gives teams a clear sense of what they're trying to do and what they can do, and people are drawn to that,” Watson says.
- Deeper understanding. According to Cardoso, open models provide the information needed to properly understand what to expect from them. Because the architecture is fully known and all weights are provided, the model can be replicated. This allows knowledgeable organizations to adapt, optimize, or apply the model in innovative ways.
- Identifying bias. Open models publish their training sources, allowing researchers to understand sources of bias and participate in improving the training data, as in the sketch after this list.
- Faster scalability improvements. Open models can benefit from collective innovation that accelerates scalability improvements, and open source communities tend to be quicker to optimize for cost and scale, especially when backed by large organizations, as is the case with Meta's Llama, Cardoso noted.
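As a brief illustration of the bias-identification point above, the sketch below loads an openly published dataset and checks its composition, the kind of inspection that closed training data never allows. It assumes the Hugging Face datasets library, and the IMDB dataset is an arbitrary open example, not one named by the experts quoted here.

```python
# A sketch of inspecting an open training dataset for composition
# and potential bias. Assumes the Hugging Face datasets library;
# IMDB is an arbitrary openly published example.
from collections import Counter
from datasets import load_dataset

ds = load_dataset("imdb", split="train")
print(ds)  # size, features, and schema of the published data

# Check label balance, one basic bias signal.
print(Counter(ds["label"]))

# Raw examples can be read directly and proposed for correction.
print(ds[0]["text"][:200])
```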
Framework for stakeholder discussions
Each company will take its own approach to the openness of the AI capabilities it uses and how it offers new services and products. In discussions with company and community stakeholders, it is important to consider the trade-offs among ethics, performance, explainability, and intellectual property protection.
Cardoso said vendors of closed models tend to invest more resources in security and AI tuning efforts, and he has found that closed models tend to perform better. However, the community around open AI can often fill these gaps to a certain extent. One example is the emergence of fine-tuning on smaller model versions, which allows for cost-effective, domain-specific models that can provide better performance in those domains.
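As a hedged illustration of the fine-tuning pattern Cardoso describes, the sketch below adapts a small open model to a domain corpus with LoRA adapters, which train a few lightweight weights instead of the whole network. It assumes the Hugging Face transformers, peft, and datasets libraries; the GPT-2 base model and the domain_corpus.txt file are illustrative placeholders, not anything he specified.

```python
# A sketch of cost-effective domain fine-tuning with LoRA adapters.
# Assumed libraries: transformers, peft, datasets.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "gpt2"  # stand-in for any smaller open model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small adapter matrices instead of the full model,
# which is what keeps domain adaptation cheap.
model = get_peft_model(model, LoraConfig(
    task_type="CAUSAL_LM", r=8, lora_alpha=16,
    target_modules=["c_attn"]))  # GPT-2's attention projection

# domain_corpus.txt is a hypothetical file of in-domain text.
data = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = data["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the small adapter weights
```

Because only the adapter weights are trained and saved, the same base model can serve many domains, which is the cost advantage Cardoso points to.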
Steele said open AI makes sense for companies that want to benefit from the default behavior of AI applications and don't face data privacy or usage risks. But if a company wants to extend the AI or create its own data sets, closed AI makes more sense, he said.
DeSanto recommended that organizations start by facilitating conversations between technical AI teams, legal teams and AI service providers.
“Establishing a baseline and developing a common language within your organization will go a long way in determining where to focus with AI and minimize risk,” he said.
From there, organizations can begin to set appropriate guardrails and policies for their AI implementations, including policies around how employees can use AI, as well as guardrails such as data sanitization, in-product disclosure, and moderation features.
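As one concrete illustration of the data sanitization guardrail mentioned above, the sketch below redacts obvious PII from a prompt before it is sent to any external AI service. The regex patterns and placeholder format are illustrative assumptions, not a complete policy or any vendor's prescribed approach.

```python
# A minimal sketch of a data sanitization guardrail: redact obvious
# PII before a prompt leaves the organization. Patterns are
# illustrative and far from exhaustive.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def sanitize(prompt: str) -> str:
    """Replace matched PII with typed placeholders so sensitive
    values never reach an external model."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(sanitize("Email jane.doe@example.com or call 555-867-5309."))
# -> "Email [EMAIL] or call [PHONE]."
```

A production guardrail would typically pair pattern matching like this with a trained PII detector and audit logging, but the shape of the check is the same.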
Editor's note: This article was updated in July 2024 to improve reader experience.
George Lawton is a London-based journalist who has written more than 3,000 articles over the past 30 years on his areas of interest, including computers, communications, knowledge management, business and health.