Review, Remask, Refine: Process-Guided Block Diffusion for Text Generation
Nikita Mounier · Parsa Idehpour
Keywords:
Windowed Evaluation
Inference-Time Guidance
Text Generation
LLaDA
PRM
Iterative Refinement
Process Reward Models
Computational Efficiency
Self-Correction
Mathematical Reasoning
Qwen2.5-Math-PRM
Block Diffusion
Error Correction
Masked Diffusion Models
Abstract
A key challenge for iterative text generation is enabling models to efficiently identify and correct their own errors. We propose Review, Remask, Refine (R3), a simple yet effective framework that requires no additional model training and can be applied to any pre-trained masked text diffusion model (e.g., LLaDA or BD3-LM). In R3, a Process Reward Model (PRM) performs the $\textbf{Review}$ of intermediate generated blocks. The framework then translates these PRM scores into a $\textbf{Remask}$ strategy: the lower a block's PRM score, indicating potential mistakes, the larger the proportion of tokens remasked within that block. Finally, the model is compelled to $\textbf{Refine}$ these targeted segments, concentrating its effort on the specific sub-optimal parts of past generations and thereby improving the final output.
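The Remask step described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: the linear score-to-fraction schedule, the `max_fraction` cap, and the function names `remask_fraction` and `remask_block` are all assumptions made for the example, and token positions are chosen uniformly at random within the block.

```python
import random

def remask_fraction(prm_score, max_fraction=0.8):
    """Map a PRM score in [0, 1] to a remask fraction.
    Lower score -> larger fraction (hypothetical linear schedule;
    max_fraction caps how much of a block can be remasked)."""
    return max_fraction * (1.0 - prm_score)

def remask_block(tokens, prm_score, mask_token="[MASK]", rng=None):
    """Remask a proportion of a block's tokens based on its PRM score.
    Positions are sampled uniformly at random for this sketch."""
    rng = rng or random.Random(0)
    n_remask = round(remask_fraction(prm_score) * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n_remask))
    return [mask_token if i in positions else t
            for i, t in enumerate(tokens)]

# A low-scoring block gets more of its tokens remasked for refinement:
block = ["step", "1", ":", "compute", "the", "sum", "of", "both", "terms", "."]
remasked = remask_block(block, prm_score=0.5)  # remasks 4 of 10 tokens
```

A real pipeline would then feed the remasked block back to the diffusion model, which regenerates only the masked positions (the Refine step), repeating until PRM scores pass a threshold or an iteration budget is exhausted.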