One of the reasons generative AI applications such as ChatGPT and Midjourney are so popular is their extensive knowledge base. For businesses, however, the next step in adopting generative AI appears to be smaller, more flexible systems tailored to enterprise IT architectures and use cases.
It became clear earlier this week, at MIT Technology Review’s EmTech Digital 2023 conference, that organizations are rushing to adopt generative models. But for these enterprise deployments to succeed, organizations need to understand how to integrate generative AI into their existing IT architecture in a secure and customized way.
Access to Infrastructure Limits Widespread Adoption of AI
As interest in generative AI grows, supporting model training and operations becomes challenging at the hardware and infrastructure level.
Large models with billions of parameters, such as GPT-4, require highly optimized underlying infrastructure, which can be costly and difficult to build. “It’s very expensive and difficult to build a new system [with] memory systems that can deliver data at high capacity and speed,” said Danner Stodolsky, senior vice president of cloud at enterprise AI platform vendor SambaNova Systems.
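A back-of-envelope calculation shows why (the parameter counts below are generic assumptions; GPT-4’s actual size is undisclosed). Just holding a model’s weights in memory scales linearly with parameter count:

```python
# Rough memory required just to hold model weights in accelerator memory.
# Ignores activations, optimizer state, and KV caches, which add much more.
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Memory in GB at the given precision (2 bytes/param = fp16)."""
    # params_billions * 1e9 params * bytes_per_param bytes / 1e9 bytes per GB
    return params_billions * bytes_per_param

print(weight_memory_gb(175))  # a 175B-parameter model: 350.0 GB in fp16
print(weight_memory_gb(7))    # a 7B-parameter model: 14.0 GB in fp16
```

At fp16, a 175-billion-parameter model needs hundreds of gigabytes for its weights alone, far more than any single GPU holds, which is why serving such models demands the kind of specialized, high-bandwidth memory systems Stodolsky describes.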
The challenge for many organizations is balancing security and compliance needs with the compute required to run generative AI at scale. According to Stodolsky, SambaNova’s current customers are primarily interested in hosting systems in their own data centers, typically for privacy and security reasons, though more recently there has been increased interest in the cloud.
Dror Weiss, co-founder and CEO of AI coding assistant vendor Tabnine, said: “Everybody wants a GPU. They want hundreds of GPUs. It’s the fastest, but that doesn’t meet the security requirements of every customer we work with.”
Deploying a private cloud can be an option for companies with strict compliance requirements that prohibit sending data to external parties. For example, Weiss said users can run Tabnine’s AI prediction server in a virtual private cloud, on the corporate network, or even on their own hardware with the right GPU.
Low latency can also be a strong argument in favor of local deployment for certain use cases. In the session “Off the Shelf AI,” Prabhdeep Singh, vice president of software products and engineering at SambaNova, gave the example of analyzing data from a 4K camera for defect detection on an assembly line.
“The amount of data coming in and the inference that needs to be done in near real time puts a lot of strain on these systems,” Singh said in his presentation. “We just don’t have enough bandwidth to send data to the cloud, so we need an on-premises system that can do this at scale and quickly.”
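Singh’s bandwidth point is easy to sanity-check with rough arithmetic (the resolution, color depth, and frame rate below are generic assumptions, not figures from the talk):

```python
# Back-of-envelope data rate for one uncompressed 4K camera feed.
WIDTH, HEIGHT = 3840, 2160   # 4K UHD resolution
BYTES_PER_PIXEL = 3          # 24-bit RGB color
FPS = 30                     # assumed frame rate

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
gbits_per_second = bytes_per_second * 8 / 1e9
print(f"{gbits_per_second:.1f} Gbit/s per camera")  # ~6.0 Gbit/s
```

Even with heavy compression, a handful of such feeds can saturate a typical site uplink, which is the practical case for running inference on premises in this scenario.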
Evaluating Open Source and Proprietary Generative AI
Partly because of these infrastructure challenges, more and more developers and researchers are supporting open source AI models.
In line with the classic argument for open source software, proponents of open AI hope that crowdsourced knowledge and input will lead to better models. Additionally, large-scale generative AI models are expensive to train and refine, making them out of reach for many smaller organizations. An open-source approach could make the power of generative AI available to users who don’t have the resources to train customized models themselves.
With open source, “you don’t have to manage the infrastructure of these complex generative models,” said Bill Marino, principal product manager at Stability AI, an open source generative AI company, during the session “AI One Future.” Instead, users can consume baseline models via API endpoints and fine-tune customized models with their own data.
Having a diverse set of open models can democratize access to generative AI and, in doing so, reduce bias in AI systems. “Having diverse opinions about what we should prioritize and what values we should hold is very important in understanding which direction we should head,” said Margaret Mitchell, Hugging Face’s chief ethics scientist, in the same session.
By providing open access to the code and training data of AI systems, the reasons for discriminatory or harmful model output can be uncovered and ideally prevented. “By doing this openly, everyone gets more insight into why the models are making the decisions they’re making,” Marino said in an interview.
However, despite the advantages of the open source model, companies may need their own model. Generative AI is so new to many companies that being able to experiment with minimal risk is critical to successful adoption.
“Let’s be honest here, no one really knows how these things will work for individual companies,” Singh said in his presentation. “So you need a safe environment and sandbox first. Try them out and see where these things really come in handy.”
Conducting such explorations using consumer-oriented tools carries security and privacy risks. For example, inputs to ChatGPT may be used as training data, an unacceptable risk for enterprise users, some of whom have already experienced the fallout of sensitive information being included in prompts.
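One common mitigation is to scrub obvious secrets from prompts before they leave the corporate network. The sketch below is illustrative only; the patterns are simplistic examples, not a complete data-loss-prevention solution:

```python
# Illustrative prompt sanitizer: redact obvious secrets before a prompt
# is sent to an external generative AI service.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "API_KEY": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
}

def redact(prompt: str) -> str:
    """Replace each matched secret with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(redact("Contact alice@example.com with key sk-abcdefghijklmnop1234"))
```

Real deployments would pair such filtering with policy controls and logging; regex redaction alone catches only well-formed, known patterns.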
“We see all these issues with data leakage and provenance,” said Stodolsky. “There are many benefits to controlling the risk, provenance, and understandability of a system.” Controlling a model’s training data, he added, also gives companies confidence that its output won’t infringe copyright.
Small Models Fine-Tuned for Enterprise Use Cases
In addition to addressing security and compliance issues, models fine-tuned for individual companies can also address the issue of inadequate infrastructure to support the deployment of generative AI.
Rather than using giant models with broad knowledge bases such as GPT-4, organizations are beginning to look at smaller, lighter systems that are trained and specialized in specific domains. “We believe the big opportunity that is obvious to everyone lies in small models that are fine-tuned and focused on specific use cases,” Marino said in his presentation.
More generalized models have advantages, such as for initial exploratory work in new domains. However, they are often not well suited to enterprise use cases such as answering detailed questions in contact centers or generating content for marketing campaigns.
This is due, in part, to stricter corporate requirements for model output accuracy. An individual using ChatGPT for personal purposes may be frustrated by inaccurate responses, but an enterprise model that fails at tasks such as customer service or quality control can cause real financial and reputational damage.
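For this reason, enterprise deployments typically gate a model behind an offline evaluation against a labeled, domain-specific test set. A minimal sketch (the model and test set here are toy stand-ins, not any vendor’s product):

```python
# Minimal offline evaluation harness: score a model on labeled domain data
# before it is allowed near customer-facing tasks.
def accuracy(model, test_set):
    """Fraction of (prompt, expected) pairs the model answers correctly."""
    correct = sum(1 for prompt, expected in test_set if model(prompt) == expected)
    return correct / len(test_set)

# Toy stand-in "model" (a lookup) and test set, for illustration only.
toy_model = {"reset password": "account", "refund status": "billing"}.get
test_set = [("reset password", "account"),
            ("refund status", "billing"),
            ("where is my order", "shipping")]
print(accuracy(toy_model, test_set))  # 2 of 3 correct
```

A team would set the passing threshold according to the financial and reputational stakes of the task before approving deployment.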
By building narrowly tuned models that are easier to fine-tune and evaluate, “we give people and businesses more confidence [in a system that is] understandable and low risk,” Stodolsky said.
Using tools such as ChatGPT can make it difficult for users to recognize factual or logical errors, because generative AI “generates wrong answers in a different way than we’re used to.” However, when a small system with a well-defined purpose starts returning responses outside its intended domain, users are more likely to realize that the system is “hallucinating.”
Additionally, narrower models are more agile and fit more easily into existing enterprise infrastructure, making it easier to focus them on an organization’s own data, a key component of enterprise adoption. “For large customers, this degree of control, scalability and flexibility is important,” Weiss said.
Therefore, a promising option for companies is to customize the baseline model to fit their needs and workflow. For example, SambaNova gives customers access to open source models that companies can train with their own data to create models that more closely match their use cases.
Similarly, Weiss said Tabnine is working on features that will allow customers to connect their codebases to Tabnine to build more customized versions of its AI coding assistant for enterprise users. “They bring their private code into Tabnine and we provide more professional code suggestions this way,” he said. “For us, the private code front is especially exciting.”
Moving forward, the safe and effective enterprise adoption of generative AI will require careful consideration of each business’s use cases and risks. “Everyone is excited about these generative models right now, and there is no doubt about it,” Marino said. “But what I’m thinking about now, for many companies and other integrators of these models, is the process of undergoing this in-depth evaluation.”
