Self-Augmenting Retrieval for Diffusion Language Models
Paul Jünger ⋅ Justin Lovelace ⋅ Linxi Zhao ⋅ Dongyoung Go ⋅ Kilian Weinberger
Abstract
Diffusion language models offer fast, parallel decoding via non-autoregressive generation and uncertainty-aware denoising, yet these properties remain underexplored for retrieval. We propose *Self-Augmenting Retrieval for Diffusion Language Models*, a dynamic framework that uses intermediate diffusion states to refine retrieval throughout the denoising trajectory. At each iteration, we query an external corpus with the partially denoised text, retrieve additional evidence, and condition subsequent denoising steps on the updated context. This tightly couples retrieval to the diffusion process: high-confidence tokens guide retrieval early, while uncertain spans are completed after new evidence is incorporated. Experiments with DREAM-7B, a discrete diffusion language model, on open-domain question answering benchmarks show significant improvements in answer accuracy over static question-only retrieval, while achieving 2--6$\times$ higher throughput than autoregressive baselines, demonstrating that diffusion decoding offers a compelling paradigm for efficient, high-quality retrieval-augmented generation.
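The abstract's core loop — denoise a few tokens, re-query the corpus with the partially denoised text, then condition further denoising on the refreshed evidence — can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: the overlap-based `retrieve` and the frequency-based `denoise_step` are illustrative stubs, not the paper's retriever or the DREAM-7B denoiser.

```python
# Hypothetical sketch of self-augmenting retrieval for diffusion decoding.
# MASK marks not-yet-denoised positions; real models use learned mask tokens.
MASK = "_"

def retrieve(corpus, query_tokens):
    """Toy retriever: rank passages by token overlap with unmasked query tokens."""
    query = {t for t in query_tokens if t != MASK}
    scored = [(len(query & set(doc.split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: -pair[0])
    return [doc for score, doc in scored if score > 0]

def denoise_step(tokens, evidence):
    """Toy denoiser: fill the leftmost mask with the most frequent evidence word
    not already in the partial text (a stand-in for the model's most confident
    prediction conditioned on the retrieved context)."""
    words = [w for doc in evidence for w in doc.split() if w not in tokens]
    if MASK not in tokens or not words:
        return tokens
    best = max(words, key=words.count)  # ties resolve to the first occurrence
    i = tokens.index(MASK)
    return tokens[:i] + [best] + tokens[i + 1:]

def generate(question, corpus, length=3, steps=3):
    """Interleave retrieval with denoising across the trajectory."""
    tokens = [MASK] * length
    evidence = retrieve(corpus, question.split())  # static question-only retrieval
    for _ in range(steps):
        tokens = denoise_step(tokens, evidence)
        # Self-augmentation: re-query with the partially denoised text so that
        # high-confidence tokens sharpen retrieval for later denoising steps.
        evidence = retrieve(corpus, question.split() + tokens)
    return tokens
```

The design point the sketch captures is the coupling: retrieval is not a one-shot preprocessing step keyed only on the question, but is refreshed from intermediate diffusion states, so evidence found mid-trajectory can still influence the uncertain spans completed in later steps.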