Shop Talk
open-washing
/ō-pən-wä-shĭng/
An accusation that some AI companies use the label “open source” too loosely.
This article is part of Shop Talk, a regular feature that explores the business world's idioms: the jargon, newly coined terms, and unfortunate or overused phrases.
There's a big debate in the technology world over whether artificial intelligence models should be “open source.” Elon Musk, who helped found OpenAI in 2015, sued the company and its CEO, Sam Altman, accusing them of straying from the company's mission of openness. And the Biden administration is examining the risks and benefits of open source models.
Proponents of open source AI models argue that they are fairer and safer for society, while opponents say they are more likely to be exploited for malicious purposes. One big problem with the debate: there is no agreed-upon definition of what open source AI actually means. And some have accused AI companies of “open-washing,” using the term “open source” dishonestly to make themselves look good. (Accusations of open-washing have previously been leveled at coding projects that used the open source label too loosely.)
In a blog post for Open Future, a European think tank that supports open sourcing, Alek Tarkowski wrote, “As the rules are written, one of the challenges is to build sufficient guardrails against companies' attempts at ‘open-washing.’” And one nonprofit that supports open source software projects warned, “This ‘openwashing’ trend threatens to undermine the very premise of openness: the free sharing of knowledge to enable inspection, replication, and collective progress.”
Organizations that apply the label to their models can take very different approaches to openness. OpenAI, the startup that launched the chatbot ChatGPT in 2022, discloses little about its models (despite the company's name). Meta labels its LLaMA 2 and LLaMA 3 models as open source but places restrictions on their use. The most open models, run primarily by nonprofits, disclose their source code and underlying training data and use open source licenses that allow widespread reuse. But even with these models, there are obstacles for others to replicate them.
The main reason is that, while open source software allows anyone to copy and modify it, building an AI model requires much more than code. Only a handful of companies can fund the computing power and data curation required. That is why some experts argue that calling AI “open source” is misleading at best and a marketing tool at worst.
“Even the most open AI systems do not allow open access to the resources needed to ‘democratize’ access to AI, or enable full oversight,” said David Gray Widder, a postdoctoral fellow at Cornell Tech who has studied the use of the “open source” label by AI companies.
Efforts are underway to create a clearer definition of open source AI. In March, researchers at the Linux Foundation announced a framework for classifying open source AI models into different categories. Another nonprofit, the Open Source Initiative, is also working on drafting a definition.
But Widder and others question whether truly open source AI is even possible. The exorbitant resources required to build AI models, he said, “will never go away.”