Stanford University Researchers Propose EVAPORATE: A New AI Approach That Reduces Language Model Inference Costs by 110x



Source: https://arxiv.org/pdf/2304.09433.pdf

Large language models are constantly making headlines. New research papers and updates to LLMs are released almost daily, with extraordinary features and applications across many fields. Current LLMs have enormous parameter counts and are trained on trillions of tokens, which makes both training and inference very expensive.

In a recently published research paper, researchers from Stanford and Cornell proposed ways to address the cost of running LLMs at scale. The team showed how expensive language models (LMs) become when processing large document collections, citing the example of running inference over the 55 million pages of Wikipedia: at a price of over $0.002 per 1,000 tokens, this comes to more than $100,000. The approach proposed by the authors can reduce inference cost by a factor of 110 while improving the quality of results compared to running inference directly on each document.
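The quoted figure can be checked with a rough back-of-the-envelope calculation. The tokens-per-page value below is an illustrative assumption, not a number from the paper:

```python
# Rough sanity check of the quoted Wikipedia inference cost.
# tokens_per_page is an assumed round number for illustration only.
pages = 55_000_000
tokens_per_page = 1_000            # assumption: ~1,000 tokens per page
price_per_1k_tokens = 0.002        # USD, as quoted in the article

cost = pages * tokens_per_page / 1_000 * price_per_1k_tokens
print(f"${cost:,.0f}")  # -> $110,000
```

Under that assumption the total lands just above the $100,000 mark cited by the authors.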

The authors built a prototype system called EVAPORATE, powered by an LLM, and identified two different strategies for implementing it. The first strategy is to prompt the LLM to extract values directly from each document. The second is to prompt the LLM to synthesize code that performs the extraction. The team evaluated both approaches and found a trade-off between cost and quality: code synthesis was cheaper, but less accurate than having the LLM process each document directly.
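The cost difference between the two strategies comes down to how many LLM calls each one makes. A minimal sketch of the idea follows; `call_llm` and the "synthesized" function are hypothetical stand-ins for a real model API and real generated code, not the paper's implementation:

```python
# Sketch of EVAPORATE's two strategies. `call_llm` is a hypothetical stub:
# a real system would query an actual language model here. For illustration,
# the stub just pulls out the value after the "Price:" label.
def call_llm(prompt: str) -> str:
    doc = prompt.split("DOCUMENT:", 1)[1]
    return doc.split("Price:", 1)[1].split()[0]

def extract_direct(docs, attribute):
    """Strategy 1: one LLM call per document -- higher quality, cost
    grows linearly with corpus size."""
    return [call_llm(f"Extract {attribute}. DOCUMENT: {d}") for d in docs]

def extract_via_code(docs, attribute):
    """Strategy 2: ask the LLM once for an extraction function, then
    reuse it on every document -- LLM cost is independent of corpus size."""
    # A real system would have the LLM write this function; we hard-code
    # a plausible synthesized function purely for illustration.
    def synthesized(doc):
        return doc.split("Price:", 1)[1].split()[0]
    return [synthesized(d) for d in docs]

docs = ["Item: mask Price: $4", "Item: glove Price: $2"]
print(extract_direct(docs, "Price"))    # uses len(docs) LLM calls
print(extract_via_code(docs, "Price"))  # uses a single (amortized) LLM call
```

Both strategies return the same values here, but only the second keeps the number of LLM calls constant as the corpus grows, which is where the cost savings come from.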


EVAPORATE identifies redundancies across multiple documents and exploits them to improve efficiency. The team illustrated this with the example of extracting device-classification attributes from FDA reports for medical devices. Instead of processing every semi-structured document with the LLM, the authors use the LLM to generate functions that can then be reused to extract from all documents.

To improve quality while keeping costs low, the team proposed an enhanced code-synthesis implementation called EVAPORATE-CODE+. This approach generates many candidate functions and uses weak supervision to ensemble their extractions. Traditionally, weak supervision has been applied to human-generated functions; EVAPORATE-CODE+ instead applies it to machine-generated functions, addressing the challenges of this setting and enabling quality improvements.
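The ensembling idea can be sketched very simply: run several candidate extraction functions on each document and aggregate their outputs, so that a buggy candidate is outvoted by the others. The candidates below are illustrative stand-ins for LLM-generated functions, and majority vote is a simplified stand-in for the paper's weak-supervision aggregation:

```python
from collections import Counter

# Candidate extraction functions, as an LLM might have synthesized them.
# These are illustrative only; cand_c is deliberately buggy.
def cand_a(doc):
    return doc.split("Class:", 1)[1].strip().split()[0]

def cand_b(doc):
    return doc.split("Class:")[-1].split()[0]

def cand_c(doc):
    # Buggy candidate: grabs the first token of the document instead.
    return doc.split()[0]

def ensemble(doc, candidates):
    """Aggregate candidate outputs per document by majority vote
    (a simplification of weak-supervision aggregation)."""
    votes = []
    for fn in candidates:
        try:
            votes.append(fn(doc))
        except Exception:
            pass  # a crashing candidate simply casts no vote
    return Counter(votes).most_common(1)[0][0]

doc = "Device: pump Class: II"
print(ensemble(doc, [cand_a, cand_b, cand_c]))  # -> II
```

Even with one faulty candidate in the mix, the ensembled answer is correct, which is the intuition behind generating many cheap candidate functions and letting aggregation recover quality.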

EVAPORATE was evaluated on 16 sets of documents spanning different formats, topics, and attribute types. EVAPORATE-CODE+ outperforms state-of-the-art systems while making only a sublinear number of LLM passes over the documents, resulting in a 110x reduction in the number of tokens the LLM must process (averaged across the 16 evaluation settings of 10,000 documents each).

In conclusion, this paper presents a promising approach for automating table extraction from semi-structured documents using LLMs. By identifying the trade-off between direct extraction and code synthesis, and by proposing an enhanced implementation that achieves better quality while maintaining low cost, this work marks a meaningful step forward for the data management community.


Check out the paper. Don't forget to join our 20k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com


Tanya Malhotra is a final-year student at the University of Petroleum and Energy Research in Dehradun, pursuing a Bachelor of Science in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
A data science enthusiast with strong analytical and critical-thinking skills, she has a keen interest in learning new skills, leading groups, and managing work in an organized manner.



