Russian tech giant Yandex has released a new open-source tool that helps AI companies save costs and resources when training new models.
Yandex said in its announcement that the new YaFSDP tool makes training large language models (LLMs) faster and more efficient, cutting GPU resource use by up to 20% and reducing both cost and training time.
Yandex has published resources for trying out YaFSDP on GitHub for anyone to view and experiment with.
How to train an LLM
Yandex goes into considerable technical detail in its announcement post, saying that its new open-source tool is the “most effective publicly available tool for enhancing GPU communication and reducing memory usage in LLM training,” and that it can deliver speedups of up to 26% over FSDP (Fully Sharded Data Parallel), the existing training method that YaFSDP builds on.
“LLM training is a time-consuming and resource-intensive process,” Yandex said. “Machine learning engineers and companies developing their own LLMs invest significant time and GPU resources (i.e. money) in training these models. The larger the model, the greater the time and costs associated with training it.”
The company estimates that using YaFSDP to train a model with 70 billion parameters could save the resources of around 150 GPUs, translating into potential monthly savings of around $500,000 to $1.5 million, depending on the virtual GPU provider or platform.
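To put those figures in perspective, the short Python sketch below works backwards from the quoted numbers. It is illustrative arithmetic only: the 30-day month of continuous use is an assumption, as is the idea that the 150 GPUs correspond to the full "up to 20%" saving, and neither is stated by Yandex.

```python
# Figures quoted in the article.
GPUS_SAVED = 150
MONTHLY_SAVINGS_USD = (500_000, 1_500_000)
SAVINGS_PERCENT = 20  # "up to 20%" of GPU resources

# Assumption (not from the article): GPUs are billed for a
# 30-day month of continuous use.
HOURS_PER_MONTH = 30 * 24


def implied_hourly_rate(monthly_savings_usd, gpus=GPUS_SAVED, hours=HOURS_PER_MONTH):
    """Per-GPU hourly price at which the saved GPUs would cost this much per month."""
    return monthly_savings_usd / (gpus * hours)


low, high = (implied_hourly_rate(s) for s in MONTHLY_SAVINGS_USD)

# Assumption (not from the article): the 150 GPUs are exactly the 20% saving,
# which would imply a baseline cluster of this size.
implied_cluster_size = GPUS_SAVED * 100 / SAVINGS_PERCENT

print(f"Implied price per GPU-hour: ${low:.2f} to ${high:.2f}")
print(f"Implied baseline cluster: about {implied_cluster_size:.0f} GPUs")
```

Under these assumptions, the quoted savings correspond to a rental price of roughly $4.63 to $13.89 per GPU-hour and a baseline cluster of about 750 GPUs, which is consistent with the savings depending heavily on the provider.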
Yandex says that on Meta's Llama models, which are highly regarded for being cutting edge while still being open source, the “net speedup” is 21% for Llama 2 70B and 26% for Llama 3 70B.
YaFSDP is not Yandex's first foray into AI tools: the company has previously released CatBoost, YTsaurus, AQLM, and Petals, all of which address different niches within AI development.
Yandex's move comes after the company abandoned its operations in Ukraine following Russia's invasion and sold its domestic operations to its former management.
Many major tech companies are also building AI into the foundation of their products, most recently Apple, which is releasing its Apple Intelligence service as part of its iOS 18 update.
