Researchers at Los Alamos National Laboratory have developed a tool that detects, with high accuracy, when an artificial intelligence model describes objects that are not present in an input image. The Prelim Attention Score (PAS) monitors an autoregressive visual language model while it generates output, pinpointing where its information comes from and where errors are likely to arise in the generation process itself. “PAS is a real-time, plug-and-play metric that acts as an internal monitor for AI,” explains Los Alamos computer scientist Manish Bhattarai. Unlike many detection methods, PAS can be quickly integrated into existing systems with minimal computational overhead, giving developers a practical path to more reliable multimodal AI. Potential applications range from medical image processing to scientific document analysis.
Detecting hallucinations in visual language models using the Prelim Attention Score
A new metric developed at Los Alamos National Laboratory marks an important step toward mitigating the pervasive problem of hallucinations in visual language artificial intelligence systems. The Prelim Attention Score (PAS) evaluates in real time whether a model’s output is grounded in the provided image or draws instead on its own previously generated, potentially inaccurate text. Autoregressive visual language models, which generate responses token by token, are increasingly used for tasks that combine visual and textual data, but they tend to fabricate details that are not present in the original input. The researchers approached this problem by focusing on how these models construct answers and by identifying key points for error detection in the generation process itself.
PAS works by monitoring each token a visual language model predicts, revealing where that prediction draws its information from, and determining where hallucinations are most likely to appear. It leverages the attention patterns inherent in Transformer architectures, a popular deep learning approach, to analyze how the model weights information from three sources: the image, the initial text prompt, and the model's own previously generated output. Because PAS can be integrated into existing systems with minimal computational overhead, it lowers the barrier to adoption for developers who want to improve the reliability of their models. The Los Alamos team demonstrated that PAS detects these visual discrepancies more accurately than existing methods, reporting quantitative improvements.
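To make the idea of weighting information sources concrete, the following is a minimal sketch of how one token's attention could be split across the image, the prompt, and the previously generated output. It is an illustration only, not the Los Alamos implementation: the function name attention_share_by_source, the token layout, and the attention values are all hypothetical, and the attention row is assumed to have already been extracted from the model (for example, averaged across heads).

```python
from typing import Dict, List, Tuple


def attention_share_by_source(
    attn_row: List[float],
    spans: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Return the fraction of attention mass that falls on each source.

    attn_row: non-negative attention weights of the token being generated
              over all earlier positions (assumed already averaged over heads).
    spans:    half-open [start, end) index ranges, one per information source.
    """
    total = sum(attn_row)
    if total <= 0:
        raise ValueError("attention row must have positive total mass")
    return {
        name: sum(attn_row[start:end]) / total
        for name, (start, end) in spans.items()
    }


# Hypothetical layout: 4 image tokens, 2 prompt tokens, 2 generated tokens.
spans = {"image": (0, 4), "prompt": (4, 6), "generated": (6, 8)}

# Made-up attention weights for a single generation step.
attn_row = [0.05, 0.05, 0.05, 0.05, 0.1, 0.1, 0.3, 0.3]

shares = attention_share_by_source(attn_row, spans)
for source, share in shares.items():
    print(f"{source}: {share:.2f}")
```

In this made-up example most of the attention mass lands on the model's own earlier output rather than on the image, which is the kind of pattern the article describes as a warning sign.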
The system generates a score indicating the likelihood of a hallucination, with values closer to 0 indicating stronger grounding in the input image. “By understanding how visual language models pay attention to prior information, PAS can help identify exact instances where the model starts to rely too much on its own words,” said Los Alamos intern Xuan Nhat Hoang, highlighting the tool’s ability to pinpoint the source of errors. Potential applications span many fields, including medical imaging, scientific document analysis, and remote sensing, where accurate claims about visual content are paramount for informed decision-making. The team plans to present their findings at the Computer Vision and Pattern Recognition Conference this month.
Transformer architecture enables real-time hallucination monitoring with PAS
Autoregressive visual language models, increasingly deployed in applications ranging from medical imaging to remote sensing, generate output one token at a time. This characteristic presents both opportunities and challenges for detecting inaccuracies. Because this sequential generation is built on a Transformer architecture, the model's information sources can be monitored in real time. Rather than identifying hallucinations after a response is complete, developers can now pinpoint the moment when a model begins to deviate from verifiable data. The system's ability to work with existing models is a major advantage, as it avoids the costly and time-consuming process of redesigning a model from scratch. PAS calculates an attention-based score by examining how these Transformer-based models attend to three information streams: the input image, the provided text prompt, and the model's own previously generated output. The score quantifies the degree to which the model bases its response on the image provided rather than fabricating details from its internal parameters.
Scores closer to 0 indicate greater dependence on the input image and a lower likelihood of hallucination, while higher values indicate a potential mismatch. The research team reports that PAS detects hallucinations more accurately than existing methods, positioning it as a key tool for increasing the reliability of multimodal AI systems.
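The token-by-token monitoring described above can be sketched as a small streaming check. This is a hedged illustration, not the published PAS formula: here the score is assumed to be the fraction of attention mass a token places on the model's own earlier output, so that 0 means no reliance on prior output. The function names, the threshold of 0.5, the tokens, and the attention values are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def generated_attention_score(attn_row: List[float], generated_start: int) -> float:
    """Fraction of attention mass on the model's own earlier output tokens.

    This is an assumed stand-in for the score, chosen so that values near 0
    mean the token leans on the image and prompt rather than on prior output.
    """
    total = sum(attn_row)
    if total <= 0:
        raise ValueError("attention row must have positive total mass")
    return sum(attn_row[generated_start:]) / total


def monitor(
    steps: Iterable[Tuple[str, List[float]]],
    generated_start: int,
    threshold: float = 0.5,  # hypothetical cutoff; would need tuning on real data
) -> Iterator[Tuple[str, float, bool]]:
    """Yield (token, score, flagged) as each token is produced."""
    for token, attn_row in steps:
        score = generated_attention_score(attn_row, generated_start)
        yield token, score, score > threshold


# Hypothetical sequence: positions 0-5 hold image and prompt tokens,
# so generated tokens start at index 6. Attention rows grow by one per step.
steps = [
    ("a", [0.2, 0.2, 0.2, 0.2, 0.1, 0.1]),
    ("dog", [0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]),
    ("frisbee", [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.3, 0.4]),
]

results = list(monitor(steps, generated_start=6))
for token, score, flagged in results:
    print(f"{token!r}: score={score:.2f} flagged={flagged}")
```

Because each score depends only on the attention row for the current step, a check like this can run while the response is still being generated instead of after it is complete.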
