Join top executives in San Francisco July 11-12 to hear how they are integrating and optimizing their AI investments for success.
The era of generative AI and large language models (LLMs) has spawned a new category of tools, known as LLMOps, to support user needs.
San Francisco startup Weights and Biases announced a major update to its MLOps platform this week to enable LLMOps. In LLM-based operations, organizations and users typically don’t build entirely new models. Instead, they tweak prompts until the model produces the desired results. The need to support that use case is behind today’s release of W&B Prompts on the Weights and Biases platform. The new feature includes tools that help users quickly build LLM-based applications from a series of chained prompts that lead to optimized output.
Lukas Biewald, CEO and co-founder of Weights and Biases, said at a user meetup livestreamed from London: “We broadly define a machine learning practitioner as someone who tries to make machine learning models work in the real world.”
The Path from Machine Learning to Prompt Engineering
Since 2017, W&B has been building its MLOps platform and evolving it to meet the changing needs and types of users.
According to Biewald, the first thing the company built was a feature called Experiments, designed to help machine learning engineers track their experiments. It helped an organization keep track of all the models it was building and understand how they improved or regressed over time.
Since then, W&B has expanded the platform to include model parameter optimization, reporting capabilities to help developer groups collaborate, and an advanced set of features for tracking deliverables and for managing and deploying model workflows.
Prompt engineering has grown in recent months. The catalyst for this change is that organizations increasingly rely on LLMs from vendors such as OpenAI and Cohere rather than trying to build their own entirely proprietary models.
“Prompt engineering is the most common way to use language models at scale today. You just have to figure it out,” says Biewald.
In the past, it took data scientists and machine learning engineers considerable time and effort to apply sentiment analysis to datasets, says Biewald. In the age of LLMs, performing sentiment analysis can be as easy as supplying the right prompt.
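To make the point concrete, here is a minimal sketch of prompt-based sentiment analysis in Python. It is illustrative only and is not taken from W&B or any vendor's API: the prompt wording, the function names, and the `fake_llm` stand-in (which replaces a call to a hosted model) are all assumptions.

```python
from typing import Callable

# Hypothetical prompt wording; a real application would tune this text.
PROMPT_TEMPLATE = (
    "Classify the sentiment of the following review as "
    "positive, negative, or neutral. Answer with one word.\n\n"
    "Review: {text}\nSentiment:"
)

ALLOWED_LABELS = {"positive", "negative", "neutral"}


def build_sentiment_prompt(text: str) -> str:
    """Fill the template with the text to classify."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def classify_sentiment(text: str, llm: Callable[[str], str]) -> str:
    """Send the prompt to `llm` and normalize the reply to a known label."""
    reply = llm(build_sentiment_prompt(text))
    label = reply.strip().lower().rstrip(".")
    return label if label in ALLOWED_LABELS else "unknown"


def fake_llm(prompt: str) -> str:
    """Toy stand-in for a hosted model (assumption): keys off one word."""
    review = prompt.split("Review:", 1)[1].lower()
    return "Positive." if "love" in review else "Negative."


if __name__ == "__main__":
    print(classify_sentiment("I love this product", fake_llm))
```

In practice `fake_llm` would be swapped for a function that calls a real model; the rest of the code would not need to change, which is what makes this kind of task so much cheaper than training a dedicated classifier.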
“The market has just expanded massively, and I think every software developer, maybe everyone now, can be a machine learning practitioner,” he said. “Everybody can use machine learning models for real-world applications without much training.”
The new W&B Prompts tool fits the new LLMOps landscape by enabling companies to create accurate and effective prompts for complex tasks.
In a series of rapid-fire demos, Biewald showed what the new tools could do. First was a set of debugging tools that prompt engineers can use to trace and debug potential errors within a prompt chain, that is, a series of prompts used together or in succession to produce the desired result.
LangChain, a framework for developing language model-based applications, has also been integrated into W&B Prompts. For OpenAI-based LLMs, W&B provides integrated support for scoring the efficacy of prompts using the OpenAI Evals framework.
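The idea of a prompt chain with a trace that can be inspected afterward might look roughly like the following Python sketch. It does not use W&B Prompts or LangChain; `StepTrace`, `run_chain`, and `shouting_llm` are hypothetical names, and the model is a toy stand-in.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepTrace:
    """What one step of the chain saw and produced."""
    name: str
    prompt: str
    output: str
    seconds: float
    error: str = ""


def run_chain(
    steps: List[Tuple[str, str]],
    llm: Callable[[str], str],
    initial_input: str,
) -> Tuple[str, List[StepTrace]]:
    """Run (name, template) steps in order, feeding each output forward.

    Each template must contain an `{input}` placeholder. If a step fails,
    the chain stops and the last successful output is returned.
    """
    traces: List[StepTrace] = []
    current = initial_input
    for name, template in steps:
        prompt = template.format(input=current)
        start = time.perf_counter()
        try:
            output, error = llm(prompt), ""
        except Exception as exc:  # record the failure instead of losing it
            output, error = "", repr(exc)
        elapsed = time.perf_counter() - start
        traces.append(StepTrace(name, prompt, output, elapsed, error))
        if error:
            break
        current = output
    return current, traces


def shouting_llm(prompt: str) -> str:
    """Toy stand-in for a model (assumption): upper-cases the prompt."""
    return prompt.upper()


if __name__ == "__main__":
    chain = [("summarize", "Summarize: {input}"),
             ("translate", "Translate: {input}")]
    final, traces = run_chain(chain, shouting_llm, "hello")
    for t in traces:
        print(t.name, "|", t.prompt, "->", t.output or t.error)
```

Recording the exact prompt, output, and any error for every step is what lets an engineer see where in the chain a bad result first appeared.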
“We can see how well different models are doing, and hopefully we can see if the model is improving or deteriorating as we change the prompts,” says Biewald.
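As a rough illustration of what scoring prompts can mean, the Python sketch below compares two prompt variants by accuracy on a small labeled set. It does not use the OpenAI Evals framework or any W&B API; the function names, the templates, the two example rows, and the `toy_llm` stand-in are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)


def score_prompt(template: str, examples: List[Example],
                 llm: Callable[[str], str]) -> float:
    """Return the fraction of examples the model answers exactly right."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = 0
    for text, expected in examples:
        reply = llm(template.format(text=text)).strip().lower()
        if reply == expected.lower():
            correct += 1
    return correct / len(examples)


def compare_prompts(templates: Dict[str, str], examples: List[Example],
                    llm: Callable[[str], str]) -> Dict[str, float]:
    """Score every named prompt variant on the same labeled examples."""
    return {name: score_prompt(tpl, examples, llm)
            for name, tpl in templates.items()}


def toy_llm(prompt: str) -> str:
    """Toy stand-in for a model (assumption): terse only when asked."""
    label = "positive" if "great" in prompt.lower() else "negative"
    if "one word" in prompt.lower():
        return label
    return f"I think the sentiment is {label}."


if __name__ == "__main__":
    templates = {
        "vague": "What do you think of this? {text}",
        "strict": "Answer in one word, positive or negative. Text: {text}",
    }
    examples = [("This is great", "positive"), ("This is awful", "negative")]
    print(compare_prompts(templates, examples, toy_llm))
```

Rerunning the same comparison each time a prompt is edited gives a simple signal of whether the change helped or hurt.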
VentureBeat's mission is to be a digital town square for technical decision makers to gain knowledge about innovative enterprise technology and to transact.