Shortage of LLM Developers Affecting AI Ecosystem

The shortage of skilled large language model (LLM) developers is a multifaceted problem influenced by many factors.

LLMs are a relatively new technology still in the early stages of adoption, which means the pool of developers with experience in this area is limited.

The total cost of training a large language model is huge and increases as the model grows.

Although recent advances in computational technology and data infrastructure have reduced the cost of training LLMs, they still require enormous computational resources and are prohibitively expensive.

“Virtually only a handful of companies have the infrastructure and resources to train LLMs,” explains Activeloop CEO Davit Buniatyan. “As a result, we have only a few developers working on these projects.”

However, the trend is changing: LLM training techniques are becoming more efficient, allowing developers to train and run small LLMs with far fewer computational resources.

Themos Stafylakis, Head of Machine Learning and Voice Biometrics at Omilia, said the release of ChatGPT sparked a boom around the world. All companies are now talking about AI and trying to understand and integrate it in some way.

LLM developers need training in several areas of machine learning, including tokenization, token embeddings, transformers, encoder-decoder and decoder-only architectures, autoregressive models, adapters, and policy-based reinforcement learning, he said.

Additionally, they should be able to train very large architectures across multiple GPUs (a form of parallel processing) and create datasets that reflect how LLMs are used.

“Finally, they need to be able to do prompt engineering,” he said. “It’s hard to find a developer with all these skills.”
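Two of the skills Stafylakis lists, tokenization and autoregressive modeling, can be illustrated with a deliberately tiny sketch. It is not how production LLMs work: the whitespace tokenizer and the bigram counts stand in for subword tokenizers and transformer networks, and the function names (`tokenize`, `train_bigram`, `generate`) are hypothetical, chosen only for this illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Toy tokenizer: lowercase and split on whitespace.

    Assumption for illustration only; real LLMs use subword tokenizers.
    """
    return text.lower().split()


def train_bigram(tokens):
    """Count, for every token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_new_tokens=5):
    """Greedy autoregressive decoding.

    Each new token is chosen from what has been generated so far
    (here, only the last token) and is then appended to the output.
    """
    out = [start]
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this token
        out.append(followers.most_common(1)[0][0])
    return out


corpus = "the model reads the prompt and the model writes the answer"
model = train_bigram(tokenize(corpus))
print(generate(model, "the", max_new_tokens=3))
```

The loop in `generate` is the part that carries over to real systems: the model's own output is fed back in as input for the next step, which is what "autoregressive" means.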

LLMs Driving Generative AI Tools

According to Seth Robinson, CompTIA’s vice president of industry research, large language models are the engines that drive generative AI tools on the market.

“Software developers would need to be trained on the specific algorithms behind how these large language models work, which use a technology called transformers,” he said. “If an organization is looking for people who can develop large language models, such skills are probably in relatively short supply because this is a very new technology.”

On the other hand, the demand for these developers may not be as steep as some organizations think, he said.

“The business side might say, ‘We need to build a large language model,’” Robinson said. “But it’s probably not the case that many organizations will invest in building their own large language models.”

Buniatyan added that academia is focused on teaching individuals in data science, computer vision, and natural language processing.

“But more recently, universities have focused on LLM training beyond dedicated laboratories,” he said.

For example, according to Cornell University’s arXiv service, 45.51% of all large language model papers (out of a total of 10,729) have been published in the last 12 months.
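As a quick check of what that percentage means in absolute terms, the short calculation below converts the share into an approximate paper count. It uses only the two figures quoted above; the variable names are illustrative.

```python
total_papers = 10_729        # total LLM papers cited in the article
recent_share = 45.51 / 100   # share published in the last 12 months

# Round to the nearest whole paper.
recent_papers = round(total_papers * recent_share)
print(recent_papers)  # 4883
```

In other words, roughly 4,900 of the papers appeared within a single year.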

“Universities seem to be adapting quickly by releasing short courses on large language models,” Buniatyan said. “Still, education efforts are largely industry driven.”

This includes efforts such as the Foundational Model Certification program released by Activeloop and TowardsAI in collaboration with the Intel Disruptor Initiative.

“Building applications with LLMs is not trivial. The specific challenges are related to prompt engineering and dealing with natural language processing ambiguities,” says Buniatyan.

Requires a lot of effort and investment

Developing a “glossy demo” with an LLM is easy, but making it production-ready requires a lot of effort and resources.

According to Buniatyan, universities are beginning to focus on providing students with the skills necessary to bring LLMs into production, but these efforts are still in their early stages.

“Developing a GPT-3 type model could easily reach $5 million in computing costs per week, and training smaller versions would still cost millions of dollars,” he said.

Universities will therefore aim to develop the best talent in efficient high-performance computing (HPC).

“It starts with learning about data structures and various HPC programming techniques at university,” explained Buniatyan.

The University of California, Berkeley recently announced a new College of Computing, Data Science, and Society. The college is focused on preparing students for a world powered by AI.

Robinson pointed out that every institution has some sort of software development curriculum, and most of those curricula remain relevant.

“For someone who specializes in large language models, it’s going to be the final stage, or an advanced level after a great deal of software development, training and curriculum,” he said.

Universities and institutions now offer courses in machine learning and data science/engineering and are gradually adapting to incorporate LLMs and transformers, Stafylakis added.

“Efforts in AI began a long time ago for some, but the boom has happened so quickly that I expect these universities to keep pace and strive to increase their offerings in the field,” he said.

Evolving Job Market for LLM Developers

As the LLM field continues to advance, the job market for LLM developers is expected to evolve as well.

“Given the increasing adoption of LLMs in various applications, the demand for LLM developers will likely increase,” said Buniatyan. “Furthermore, as LLMs become more specialized and used for more complex tasks, the skill sets required of LLM developers will also evolve.”

He expects demand for prompt engineers to increase in the short term, noting that companies like Anthropic are hiring prompt engineers at a base annual salary of $280,000 to $375,000.

Near-term demand is also expected for experts across disciplines (law, medicine, etc.) who can support reinforcement learning from human feedback (RLHF).

“These domain experts will be engaged in providing feedback on the LLM output in order to improve it,” Buniatyan said.

However, although the LLM sector is booming, some challenges could affect the future job market, he added.

“The high computational costs associated with training these models and the relative shortage of developers with the necessary expertise may limit the number of jobs available in this area,” he warned.

Acquiring top LLM talent comes at a price

Stafylakis said he believes there are four ways organizations can attract top LLM developer talent in today’s world.

“Having a consistent and challenging vision for LLM is a top priority on the list because it plays a key role in inspiring developer talent,” he explained.

Additionally, companies must have a means of providing or collecting good, clean data. Without this, the LLM developer’s job is primarily to work backwards in order to move forward.

“If you’re serious about making progress in the AI space, you need a strong MLOps team to support these LLM developers,” he said.

Finally, despite the tough economic environment, the LLM developer role requires a tremendous amount of skill, so it’s important to recognize this and offer a competitive salary.

“My outlook is that the LLM developer job market will continue to thrive for several years because companies recognize that developing and offering their own LLMs is the way to stay competitive in this market,” Stafylakis said.

About the author

Nathan Eddy is a freelance writer for ITPro Today. He contributes to Popular Mechanics, Sales & Marketing Management Magazine, FierceMarkets, and CRN, among others. He made his first documentary film, The Absent Column, in 2012. He currently lives in Berlin.
