The potential for generative artificial intelligence (AI) seems endless, but the computing power it requires can be a limiting factor. One estimate puts the cost of a ChatGPT query at 1,000 times that of the same question in a regular Google search. While companies such as OpenAI are in the early stages of development and trying to build public interest, this may be an acceptable cost, even with 100 million active users added in a single month.
However, for a more general-purpose product, this kind of expense could easily become unsustainable. Even the White House has addressed the issue, noting the potential environmental impact of the increased energy consumption and data center space that expanded generative AI applications would require.
Addressing underlying costs
Before dealing with the cost of running a large language model (LLM), most companies interested in developing their own generative AI solution will face the cost of training it. Training generative AI requires time on owned or rented hardware, large amounts of data storage, and intensive energy consumption. It cost over $5 million just to train OpenAI’s GPT-3 (the version before the one adopted by ChatGPT).
However, some progress has been made in lowering the barriers to entry into generative AI. One solution developed at the Massachusetts Institute of Technology (MIT) claims to reduce the cost of training LLMs by 50%. Plus, with a more efficient training method, LLM training takes half the time.
Work smarter, not bigger
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are also raising the possibility of smaller, specialized LLMs as a way to reduce costs and improve efficiency. Restricting the dataset used by the model not only allows it to outperform models with 500 times as many parameters, but also addresses privacy and accuracy issues.
The researchers used an approach that simplifies how an LLM calculates potential responses. It works by comparing hypotheses, or probable statements, against premises, or known facts. Not only does the model rely on smaller, specialized datasets, but it also requires less training to get accurate results, according to the researchers.
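To make the idea of checking hypotheses against known facts more concrete, here is a minimal sketch in Python. It is not the researchers' model: the function names `toy_support_score` and `pick_best_hypothesis` are hypothetical, and the word-overlap score is only a stand-in for a trained language model. It shows the general shape of the comparison, in which each candidate statement is scored against a known fact and the best-supported one wins.

```python
from typing import Callable, List


def toy_support_score(known_fact: str, hypothesis: str) -> float:
    """Toy stand-in for a trained model (an assumption for illustration).

    Returns the fraction of words in the hypothesis that also appear
    in the known fact. A real system would use a learned model here.
    """
    fact_words = set(known_fact.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    matched = sum(1 for word in hypothesis_words if word in fact_words)
    return matched / len(hypothesis_words)


def pick_best_hypothesis(
    known_fact: str,
    hypotheses: List[str],
    score: Callable[[str, str], float] = toy_support_score,
) -> str:
    """Return the candidate statement best supported by the known fact."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    return max(hypotheses, key=lambda h: score(known_fact, h))


if __name__ == "__main__":
    fact = "the model was trained on a small dataset"
    candidates = [
        "the model used a small dataset",
        "the model used a huge corpus",
    ]
    print(pick_best_hypothesis(fact, candidates))
```

Because the scoring function is passed in as a parameter, the toy word-overlap score could be swapped for a real model without changing the comparison logic.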
