Large language models (LLMs), built on top of large foundation models, have shown a general ability to perform a variety of tasks that were previously out of reach. However, such models still need further fine-tuning to perform well in specialized domains or on specific tasks. Common approaches to fine-tuning a large model include:
- Continuous pre-training on a niche domain, so that a broad base model acquires expertise in that domain.
- Instruction tuning, which trains a general-purpose base model to understand and carry out specific types of natural language instructions.
- RLHF (Reinforcement Learning from Human Feedback), which gives a large base model the required conversational ability.
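Of the three approaches above, instruction tuning is the easiest to picture as data. The following is a minimal sketch of how one instruction/response pair might be turned into a training example. The prompt template, the `TrainingExample` type, and the whitespace "tokenizer" are all hypothetical stand-ins chosen for illustration; they are not taken from any particular toolkit. Masking the prompt out of the loss is a common choice rather than a universal rule.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical prompt template; real projects each define their own.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


@dataclass
class TrainingExample:
    tokens: List[str]     # prompt tokens followed by response tokens
    loss_mask: List[int]  # 1 where the token contributes to the loss


def build_example(instruction: str, response: str) -> TrainingExample:
    """Format one instruction/response pair and mask out the prompt."""
    # Whitespace splitting stands in for a real subword tokenizer.
    prompt_tokens = PROMPT_TEMPLATE.format(instruction=instruction).split()
    response_tokens = response.split()
    tokens = prompt_tokens + response_tokens
    # Train only on the response: the model should learn to answer,
    # not to reproduce the instruction text.
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return TrainingExample(tokens, loss_mask)


example = build_example("Translate to French: cat", "chat")
print(example.tokens)
print(example.loss_mask)
```

With a real tokenizer the structure stays the same: the mask simply covers token IDs instead of whitespace-separated words.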
Several large models have already been pre-trained and released publicly (GPT-J, BLOOM, LLaMA, etc.), but no public toolbox efficiently supports fine-tuning across all of them.
To help developers and researchers efficiently fine-tune and run inference on huge models with limited resources, a team of academics from the University of Hong Kong and Princeton University has created LMFlow, an easy-to-use, lightweight toolset.
With it, a custom model based on the 7-billion-parameter LLaMA can be trained on a single Nvidia 3090 GPU in about 5 hours. The team used the framework to fine-tune versions of LLaMA with 7, 13, 33, and 65 billion parameters on a single machine, and then released the model weights for academic research.
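The article does not say which technique makes a 7-billion-parameter model trainable on one card. As a rough, back-of-the-envelope sketch, the arithmetic below shows why conventional full fine-tuning would not fit in the 24 GB of a single 3090, which suggests some memory-saving method is involved. The 16-bytes-per-parameter figure is an assumption: it is the usual accounting for mixed-precision training with the Adam optimizer, and it ignores activations. The function names are illustrative only.

```python
GB = 1_000_000_000  # decimal gigabyte

RTX_3090_MEMORY_GB = 24  # memory available on one Nvidia 3090


def weights_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone (2 bytes per parameter in fp16)."""
    return num_params * bytes_per_param / GB


def full_finetune_gb(num_params: int) -> float:
    """Rough memory for full fine-tuning with Adam in mixed precision.

    Assumes 16 bytes per parameter: fp16 weights (2) + fp16 gradients (2)
    + fp32 master weights, momentum, and variance (4 + 4 + 4).
    Activations are ignored, so this is a lower bound.
    """
    return num_params * 16 / GB


llama_7b = 7_000_000_000
print(weights_gb(llama_7b))        # 14.0
print(full_finetune_gb(llama_7b))  # 112.0
print(full_finetune_gb(llama_7b) > RTX_3090_MEMORY_GB)  # True
```

Under these assumptions the fp16 weights alone fit on the card, but the optimizer state for full fine-tuning does not.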
Optimizing the output of a freely available large language model takes four steps:
- The first step, domain adaptation, trains the model to better handle a particular domain.
- The second step, task adaptation, trains the model to achieve a specific goal such as summarization, question answering, or translation.
- The third step, instruction fine-tuning, adjusts the model's parameters using instruction question-answer pairs.
- The final step is reinforcement learning from human feedback, which refines the model based on people's preferences.
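The four steps above form a chain in which each stage starts from the checkpoint the previous one produced. The sketch below only illustrates that ordering and the kind of data each stage consumes. It does not use LMFlow's API: the `Stage` type and `run_pipeline` function are hypothetical, training is stubbed out, and the data descriptions are assumptions based on how these stages are commonly described.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    data: str  # kind of training data the stage consumes (assumed)


# Order follows the four steps listed above.
STAGES: List[Stage] = [
    Stage("domain adaptation", "unlabeled text from the target domain"),
    Stage("task adaptation", "input/output pairs for one task"),
    Stage("instruction fine-tuning", "instruction question-answer pairs"),
    Stage("RLHF", "human feedback on model outputs"),
]


def run_pipeline(base_model: str, stages: List[Stage]) -> List[str]:
    """Chain the stages: each one starts from the previous checkpoint.

    Training itself is stubbed out; only checkpoint names are tracked.
    """
    checkpoints = [base_model]
    for stage in stages:
        previous = checkpoints[-1]
        checkpoints.append(f"{previous} -> {stage.name}")
    return checkpoints


for checkpoint in run_pipeline("base", STAGES):
    print(checkpoint)
```

Keeping every intermediate checkpoint makes it possible to stop early, for example after instruction fine-tuning, when no human feedback data is available.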
LMFlow provides a complete fine-tuning procedure for these four steps, allowing individuals to train huge language models despite limited computational resources.
LMFlow offers a comprehensive fine-tuning approach for large-scale models, with features such as continuous pre-training, instruction tuning, and RLHF, along with an easy and flexible API. With LMFlow, training a personalized model is now within everyone's reach. For tasks such as question answering, conversation, writing, translation, and expert consultation on various subjects, each person can choose the appropriate model based on the resources available to them. Given a sufficiently large model and dataset, training for longer generally yields better results. The team reports that it recently trained a 33B model that outperforms ChatGPT.
Dhanshree Shenwai is a computer science engineer with extensive experience in FinTech companies covering the fields of finance, cards and payments, and banking, with a strong interest in AI applications. She is passionate about exploring new technologies and advancements in today’s evolving world to make life easier for everyone.
