GuidedBridge: Training-Free Improvement of Bridge Models with Prior Guidance
Abstract
Guidance methods, e.g., classifier-free guidance (CFG) and auto-guidance (AG), have markedly improved noise-to-data diffusion generation. Recently, bridge models have been proposed, which follow a data-to-data sampling process that exploits instructive information from a clean prior representation, showing advantages on tasks such as image-to-image translation. In this work, we design a guidance method tailored to bridge models, named prior guidance (PG). Rather than emphasizing condition alignment (CFG) or score accuracy (AG), we construct, without any training, an additional weak prior for the pre-trained bridge model and extrapolate between the resulting estimates to further encourage prior exploitation. We then analyze the underlying mechanism of prior exploitation in the bridge process and design frequency-modulated prior guidance (FMPG), which tailors the guidance scale to low- and high-frequency bands in a manner coherent with bridge generative dynamics. Finally, to address the difficulty bridge models face in image inpainting, we develop a cascaded guidance framework, CFG-FMPG, which first generates a coarse prior under a global semantic condition and then refines it with FMPG, naturally combining their complementary advantages along the sampling trajectory. Experiments on strong pre-trained bridge models, DDBM and DBIM, validate the consistent improvements achieved by our training-free design.
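To make the extrapolation idea concrete, the following is a minimal sketch of guidance-style extrapolation between a weak-prior estimate and the standard estimate, plus a frequency-modulated variant that applies separate scales per band. All function names, the FFT-based band split, and the radial cutoff are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def prior_guidance(x_weak, x_strong, w):
    # CFG-style extrapolation: move from the weak-prior estimate toward
    # (and, for w > 1, beyond) the standard estimate. w = 1 recovers x_strong.
    return x_weak + w * (x_strong - x_weak)

def frequency_modulated_guidance(x_weak, x_strong, w_low, w_high, cutoff):
    # Hypothetical FMPG sketch: split the guidance direction into low- and
    # high-frequency bands with a radial FFT mask, scale each band separately,
    # then transform back and add to the weak estimate.
    diff = x_strong - x_weak
    F = np.fft.fftshift(np.fft.fft2(diff))
    h, w_ = diff.shape
    yy, xx = np.ogrid[:h, :w_]
    radius = np.hypot(yy - h / 2, xx - w_ / 2)
    mask_low = radius <= cutoff          # low-frequency band (near spectrum center)
    guided = w_low * (F * mask_low) + w_high * (F * ~mask_low)
    diff_guided = np.fft.ifft2(np.fft.ifftshift(guided)).real
    return x_weak + diff_guided
```

With `w_low = w_high = 1` the frequency-modulated variant reduces to the standard estimate, so the banded scales only matter when the two bands are weighted differently.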