
Large language models (LLMs) are remarkably good at drawing inferences and producing plausible answers, but they often fail to acknowledge their mistakes and tend to hallucinate when asked questions they have never seen before. When responses run longer than a single token, obtaining reliable confidence estimates from LLMs becomes even more important.
Both training-based and prompt-based approaches have previously been used to elicit confidence from LLMs. Prompt-based approaches, for example, use specific prompts to produce confidence ratings, or check the consistency of multiple sampled answers as a proxy for confidence. Training-based methods construct specialized calibration datasets for fine-tuning. However, these techniques often yield suboptimal or overly coarse confidence estimates that do not faithfully represent the model's actual degree of certainty.
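The consistency-checking idea can be illustrated in a few lines. The sketch below assumes a hypothetical `generate` callable that returns one sampled answer per call (e.g., a wrapper around an LLM API with nonzero temperature); the agreement rate among samples serves as the confidence score.

```python
from collections import Counter

def consistency_confidence(generate, prompt, n_samples=10):
    """Estimate confidence as the agreement rate among sampled answers.

    `generate` is a hypothetical callable returning one sampled answer
    string per call; agreement among the samples serves as confidence.
    """
    answers = [generate(prompt) for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n_samples

# If 7 of 10 samples agree on the same answer, confidence is 0.7.
```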
A new study by Purdue University, the University of Illinois Urbana-Champaign, the University of Southern California, and the Hong Kong University of Science and Technology introduces SaySelf, a training framework that teaches LLMs to express more accurate and fine-grained confidence estimates. Importantly, and unlike previous work, SaySelf also enables LLMs to indicate where their knowledge is lacking by producing a self-reflective rationale that explains their confidence estimates. To achieve this, the researchers use off-the-shelf LLMs (such as GPT-4) to automatically build model-specific datasets that can then be used for supervised fine-tuning. For each query, they sample multiple reasoning chains from the LLM, i.e., sequences of tokens that represent the model's thought process. The reasoning chains are then grouped into clusters according to their semantic similarity, and one representative example from each cluster is kept.
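A minimal sketch of this sampling-and-clustering step might look like the following. It assumes the `sentence-transformers` package for embeddings and uses a simple greedy cosine-similarity grouping; the paper's exact clustering procedure and threshold may differ.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed encoder choice

def dedupe_reasoning_chains(chains, threshold=0.85):
    """Keep one representative reasoning chain per semantic cluster.

    Greedy sketch: a chain starts a new cluster unless it is at least
    `threshold`-similar (cosine) to an already kept representative.
    """
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chains, normalize_embeddings=True)
    kept_chains, kept_embeddings = [], []
    for chain, emb in zip(chains, embeddings):
        # On unit vectors, cosine similarity is just a dot product.
        if all(np.dot(emb, kept) < threshold for kept in kept_embeddings):
            kept_chains.append(chain)
            kept_embeddings.append(emb)
    return kept_chains
```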
GPT-4 is then asked to examine the selected instances across clusters and summarize, in natural language from a first-person perspective, the uncertainty about the specific knowledge involved. To ensure accurate confidence estimates, the researchers use reinforcement learning to calibrate the confidence the LLM assigns to each response, devising a reward function that discourages overconfident predictions and penalizes the model when it is confidently wrong. The study evaluates SaySelf across a range of knowledge-intensive question-answering tasks, including complex medical diagnosis and legal case analysis, and shows that it significantly reduces confidence calibration error while maintaining task performance. The generated self-reflective rationales, which successfully capture the model's internal uncertainty, improve calibration performance further.
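The reward shaping can be sketched as follows. This is one plausible shape, not necessarily the exact function used in the paper: the model earns more for confident correct answers and pays a penalty proportional to its stated confidence when it is wrong, so overconfident mistakes cost the most.

```python
def calibration_reward(is_correct: bool, confidence: float) -> float:
    """Reward a stated confidence in [0, 1] (illustrative shape only).

    Correct answers earn a reward that grows with confidence; wrong
    answers incur a penalty that grows with confidence, so the model
    learns to lower its confidence where its knowledge is weak.
    """
    return confidence if is_correct else -confidence

# calibration_reward(True, 0.9)  -> +0.9  (confident and right)
# calibration_reward(False, 0.9) -> -0.9  (confident and wrong: worst case)
# calibration_reward(False, 0.2) -> -0.2  (hedged and wrong: mild penalty)
```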
Here are a few examples, by no means exhaustive, of how this work may affect academic research and practical applications: (1) from the perspective of LLM alignment, transparent confidence statements with explanations can benefit trustworthy AI; (2) LLMs can use the self-reflective rationales to improve their interactions and performance in follow-up activities, such as calling external tools or asking clarification questions.
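As an illustration of point (2), a downstream system could route on the stated confidence and rationale. The thresholds and action strings below are hypothetical, not part of SaySelf itself.

```python
def next_action(answer: str, confidence: float, rationale: str,
                high: float = 0.8, low: float = 0.4) -> str:
    """Route on the model's stated confidence (hypothetical thresholds).

    High confidence: answer directly. Medium: call an external tool
    (e.g., retrieval) targeted at the knowledge gap the self-reflective
    rationale describes. Low: ask the user a clarifying question.
    """
    if confidence >= high:
        return f"ANSWER: {answer}"
    if confidence >= low:
        return f"TOOL_CALL: retrieve evidence about: {rationale}"
    return "CLARIFY: could you narrow down or rephrase the question?"
```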
Beyond SaySelf, the team hopes to see promising advances in training procedures, such as proactive learning algorithms that improve what LLMs learn from their interactions with people.
Check out the paper. All credit for this research goes to the researchers of this project.

Dhanshree Shenwai is a Computer Science Engineer with extensive experience in FinTech companies covering the domains of Finance, Cards & Payments, and Banking, and has a keen interest in the applications of AI. She is passionate about exploring new technologies and advancements in today's evolving world that make life easier for everyone.
