Why Developers Flock To LLaMA, Meta’s Open Source LLM



When it comes to generative AI, the open source community is embracing Meta AI's LLaMA (Large Language Model Meta AI), released in February. Meta made LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters), but access was initially restricted to approved researchers and organizations. When the model was leaked online for anyone to download in early March, however, it effectively became fully open source.

To understand how developers are using LLaMA and what advantages it offers over similar LLMs from OpenAI and Google, we spoke with Sebastian Raschka of Lightning AI. He told us that developers are drawn to Meta's LLaMA because, unlike GPT and other popular LLMs, LLaMA's weights can be fine-tuned. This enables developers to create richer, more natural language interactions with users in applications such as chatbots and virtual assistants.

Raschka should know. His role at Lightning AI, "Lead AI Educator," reflects both his academic background (he was formerly a professor of statistics) and his high-profile social media presence (he has 192,000 followers on Twitter and runs a Substack newsletter titled Ahead of AI).

LLaMA vs. GPT: Release the Weights!

Raschka said LLaMA isn't that different from OpenAI's GPT-3 model, except that Meta shares the weights.

In the context of AI models, “weights” refer to the parameters learned by the model during the training process. These parameters are saved in a file and used during the inference or prediction stage.
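To make that concrete, here's a minimal PyTorch sketch (using a toy model as a stand-in, not LLaMA itself) of how learned weights are saved after training and loaded back for inference:

```python
import torch
import torch.nn as nn

# A toy stand-in for an LLM: its "weights" are the parameters
# learned during training.
model = nn.Linear(in_features=4, out_features=2)

# After training, the learned parameters are saved to a file...
torch.save(model.state_dict(), "weights.pt")

# ...and loaded back during the inference/prediction stage.
model.load_state_dict(torch.load("weights.pt"))
model.eval()  # switch the model to inference mode
```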

What Meta specifically did was release LLaMA's model weights to the research community under a noncommercial license. Other powerful LLMs, such as GPT, are typically accessible only through limited APIs.

"So you have to access the model through OpenAI's API, but you can't download it or run it on your own computer," said Raschka. "Basically, you can't do any customization."

In other words, LLaMA is much more adaptable for developers. This could be highly disruptive for the current leaders in LLMs, such as OpenAI and Google. In fact, the big companies are already worried, as revealed in an internal Google memo leaked this week:

"Being able to personalize language models in hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real time."

As LLM developer Simon Willison observed, while OpenAI and Google continue to race to build the most powerful language models, the work being done in the open source community is overshadowing their efforts.

Use Cases

So what are some use cases for applications built on top of LLaMA?

Raschka said financial and legal use cases are good candidates for fine-tuning. However, he noted that larger companies may want to go beyond fine-tuning and instead pre-train an entire model on their own data. So far, classification tasks such as toxicity prediction, spam classification, and customer satisfaction ranking have also been popular.
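As a rough illustration of that kind of fine-tuning, here is a hypothetical PyTorch sketch of training a spam classifier on top of a frozen pretrained backbone (a toy module stands in for LLaMA so the example is self-contained):

```python
import torch
import torch.nn as nn

class PretrainedBackbone(nn.Module):
    """Toy stand-in for a pretrained LLM that outputs feature vectors."""
    def __init__(self, dim_in: int = 128, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Linear(dim_in, hidden)  # stand-in for transformer layers

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.encoder(x))

backbone = PretrainedBackbone()
for p in backbone.parameters():
    p.requires_grad = False  # freeze the pretrained weights

head = nn.Linear(64, 2)  # small trainable head: spam vs. not spam
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

# One toy training step on a batch of already-embedded inputs.
x = torch.randn(8, 128)
labels = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(head(backbone(x)), labels)
loss.backward()
optimizer.step()
```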

According to Raschka, using LLaMA in an app can improve accuracy by 5% to 10% compared with traditional machine learning algorithms. In most cases, this can be achieved with only minor adjustments.

"It's also something that people can access," he said, "because you don't have to pre-train the model. Basically, you just fine-tune it."

LoRA and Other Tools

One tool developers can use to fine-tune LLaMA is LoRA (Low-Rank Adaptation of Large Language Models), available for free on Microsoft's GitHub account. We asked Raschka how it works.

He first said that there are various techniques for fine-tuning LLMs, including hard prompt tuning, soft prompt tuning, prefix tuning, and adapter methods. He explained that the adapter method is attractive because only small, newly added layers are trained while the rest of the transformer remains frozen. This results in fewer trainable parameters and faster training times. LoRA is a type of adapter method that, according to Raschka, uses a mathematical trick to decompose large weight matrices into smaller ones, resulting in even fewer trainable parameters and better storage efficiency. In practice, this means fine-tuning can be done much more quickly.

"Running a parameter-efficient method like LoRA, which trains only these small intermediate layers, basically takes one to three hours on the same dataset instead of 18 hours. The smaller number of parameters is an advantage."
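To show the idea behind that trick, here is a minimal, illustrative LoRA layer in plain PyTorch (our own sketch, not Microsoft's implementation): a frozen linear layer plus a trainable low-rank update.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update.

    Instead of updating the full weight matrix W (d_out x d_in), LoRA
    learns two small matrices A (r x d_in) and B (d_out x r), so the
    effective weight becomes W + (alpha / r) * B @ A.
    """
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(rank, linear.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(linear.out_features, rank))  # zero-init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank correction.
        return self.linear(x) + self.scale * (x @ self.lora_a.T) @ self.lora_b.T

# For a 4096x4096 layer, full fine-tuning touches ~16.8M parameters;
# rank-8 LoRA trains only 2 * 8 * 4096 = 65,536 of them.
layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
```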

Techniques like LoRA also make it easier to roll out an LLM to multiple customers, he added, because all you need to store for each one is a small set of matrices rather than a full copy of the model.
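In terms of the sketch above, the per-customer artifact would be just the two small LoRA matrices. Assuming `model` is a network whose linear layers have been wrapped in `LoRALinear`, saving them might look like:

```python
# Persist only the small LoRA matrices for one customer; the large
# shared base weights are stored once and reused for everyone.
lora_state = {k: v for k, v in model.state_dict().items() if "lora_" in k}
torch.save(lora_state, "customer_a_lora.pt")
```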

Developers and Fine-Tuning

Fine-tuning is a step beyond prompt engineering, so we asked Raschka whether developers need to learn how to do it.

Raschka believes that understanding how to use language models will be a useful skill for developers, but that unless a company has a very specific need, it doesn't have to be responsible for fine-tuning models itself. Smaller companies can rely on general-purpose tools such as GPT, while larger companies, he believes, will have dedicated teams responsible for fine-tuning models.

Developers are definitely interested in adding AI models to existing applications. This is where Raschka's employer, Lightning AI, comes in. It provides an open source framework called PyTorch Lightning that is used to train deep learning models, and it also offers cloud access to help users deploy machine learning systems. Incidentally, the creator of PyTorch Lightning, William Falcon, interned at Facebook AI Research in 2019 while working on his Ph.D., which may have influenced Lightning AI's support of LLaMA.

Lightning AI also has its own implementation of the LLaMA language model, called Lit-LLaMA, available under the Apache 2.0 license. Researchers at Stanford University have also trained a fine-tuned model based on LLaMA, called Alpaca.

Conclusion

LLaMA seems like a great option for developers who want more flexibility with large language models. But as Raschka points out, while fine-tuning is becoming more accessible, it's still a specialized skill that not every developer needs to learn.

Whether they fine-tune models or not, developers increasingly need to understand how LLMs can be used to improve specific tasks and workflows in their applications. So LLaMA is worth checking out, especially since it's more open than GPT and other popular LLMs.
