Learning an unmasking policy for a diffusion language model

Diffusion (large) language models (dLLMs) now match the downstream performance of their autoregressive counterparts on many tasks, and are expected to be more efficient during inference. One key design aspect of dLLMs is the sampling procedure that selects which tokens to unmask at each diffusion step. Recent work has found that heuristic strategies such as confidence thresholding improve both sample quality and token throughput compared to random unmasking. However, such heuristics have drawbacks: they require manual tuning, and their performance has been observed to degrade as block size increases. In this work, we instead propose to train the sampling procedure using reinforcement learning. Specifically, we formalize masked diffusion sampling as a Markov decision process in which the dLLM serves as the environment, and propose a lightweight policy based on a single-layer transformer that maps dLLM token confidences to unmasking decisions. Our experiments show that these trained policies match the performance of state-of-the-art heuristics when combined with semi-autoregressive (block) generation, and outperform them in the full-diffusion setting.

* Equal contributor
† University of Amsterdam
‡ Massachusetts Institute of Technology
** Work done while at Apple
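
To make the confidence-thresholding heuristic mentioned in the abstract concrete, here is a minimal Python sketch of one way such a sampler could look. It is not the authors' implementation and not the learned policy they propose. The names `toy_denoiser`, `threshold_unmask_step`, and `sample`, the use of `None` as the mask symbol, and the fallback of always unmasking the single most confident position are all assumptions made for illustration. The toy denoiser produces random distributions and stands in for a real dLLM.

```python
import random

MASK = None  # assumed placeholder for a masked position


def toy_denoiser(tokens, vocab_size, rng):
    """Hypothetical stand-in for a dLLM: returns one probability
    distribution over the vocabulary for every position."""
    dists = []
    for _ in tokens:
        weights = [rng.random() ** 4 for _ in range(vocab_size)]
        total = sum(weights)
        dists.append([w / total for w in weights])
    return dists


def threshold_unmask_step(tokens, dists, threshold):
    """Fixed heuristic policy: unmask every masked position whose top
    probability reaches the threshold. If none qualifies, unmask the
    single most confident position so that sampling always progresses
    (this fallback is an assumption of the sketch)."""
    masked = [i for i, t in enumerate(tokens) if t is MASK]
    if not masked:
        return list(tokens)
    conf = {i: max(dists[i]) for i in masked}
    chosen = [i for i in masked if conf[i] >= threshold]
    if not chosen:
        chosen = [max(masked, key=conf.get)]
    out = list(tokens)
    for i in chosen:
        out[i] = dists[i].index(conf[i])  # greedy token at that position
    return out


def sample(length, vocab_size, threshold, seed=0):
    """Run the unmasking loop until no masked positions remain.
    Returns the final token ids and the number of denoiser calls."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    steps = 0
    while any(t is MASK for t in tokens):
        dists = toy_denoiser(tokens, vocab_size, rng)
        tokens = threshold_unmask_step(tokens, dists, threshold)
        steps += 1
    return tokens, steps
```

In the Markov decision process view described in the abstract, the partially masked sequence together with the model's confidences plays the role of the state, and the set of positions to unmask is the action. The threshold rule above is one hand-tuned policy in that setting, whereas the paper proposes learning the mapping from confidences to unmasking decisions instead.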

