FaPS: A General and Fast Training Method for Diffusion Models
Xianglu Wang ⋅ Bangxian Han ⋅ Hu Ding
Abstract
Diffusion models have achieved state-of-the-art performance in image generation tasks. However, training powerful diffusion models remains time-consuming, which limits their practical deployment. In this paper, we revisit the learning dynamics of diffusion models through the lens of *spectral bias*, a phenomenon in which deep neural networks prioritize learning low-frequency modes. Through an empirical analysis of diffusion training, we observe that diffusion models exhibit a **dual** spectral bias. First, over training iterations, they fit low-frequency components earlier than high-frequency details. Second, along the diffusion timesteps, early denoising steps mainly reconstruct coarse low-frequency content, while high-frequency details emerge in later steps. Motivated by this observation, we propose Frequency-aware Patch Selection **(FaPS)**, a general and fast training method for diffusion models that can be applied to both UNet and DiT backbones. Specifically, FaPS introduces a *frequency-aware gating* that adaptively selects image patches based on their frequency information and focuses computation only on the selected patches. Since the selection decisions are discrete and thus non-differentiable, we model the gating as a stochastic policy network and optimize it end-to-end using a policy gradient method. Our experiments demonstrate that FaPS achieves up to $\mathbf{3}\times$ faster training while maintaining comparable or superior generation quality, and improves the performance of diffusion models in limited-data settings.
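The abstract describes two mechanisms: scoring patches by their frequency content, and training a non-differentiable patch-selection gate with a policy gradient. The sketch below is an illustrative toy, not the authors' FaPS implementation: `patch_high_freq_energy`, the linear Bernoulli policy in `FrequencyGate`, and the REINFORCE-style update are all assumptions made for exposition, and the reward signal (here passed in as a scalar) would in practice come from the diffusion training objective.

```python
import numpy as np

def patch_high_freq_energy(img, patch=8, cutoff=0.25):
    """Score each non-overlapping patch by the fraction of its spectral
    energy lying above a radial frequency cutoff (toy frequency feature)."""
    H, W = img.shape
    fy = np.fft.fftfreq(patch)  # normalized frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(patch)
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    high = radius > cutoff * 0.5  # treat `cutoff` as a fraction of Nyquist (0.5)
    feats = []
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            spec = np.abs(np.fft.fft2(img[i:i + patch, j:j + patch])) ** 2
            feats.append(spec[high].sum() / (spec.sum() + 1e-12))
    return np.array(feats)

class FrequencyGate:
    """Toy stochastic gating policy: one Bernoulli selection per patch,
    parameterized by a linear logit on the frequency feature and trained
    with a REINFORCE-style update (selection is discrete, hence no
    ordinary backprop through the sampling step)."""
    def __init__(self, lr=0.1):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def probs(self, feats):
        return 1.0 / (1.0 + np.exp(-(self.w * feats + self.b)))

    def select(self, feats, rng):
        # Sample a binary keep/drop mask over patches.
        return rng.random(feats.shape) < self.probs(feats)

    def update(self, feats, sel, reward):
        # REINFORCE: grad of log Bernoulli likelihood w.r.t. the logit
        # is (action - probability); scale by the scalar reward.
        g = sel.astype(float) - self.probs(feats)
        self.w += self.lr * reward * (g * feats).sum()
        self.b += self.lr * reward * g.sum()

# Usage sketch: a 16x16 image whose right half is a checkerboard
# (high-frequency) and whose left half is flat (low-frequency).
img = np.zeros((16, 16))
yy, xx = np.indices((16, 8))
img[:, 8:] = (yy + xx) % 2
feats = patch_high_freq_energy(img, patch=8)   # 4 patch scores
gate = FrequencyGate()
sel = gate.select(feats, np.random.default_rng(0))
gate.update(feats, sel, reward=1.0)            # hypothetical positive reward
```

In this toy setup the checkerboard patches receive a much higher frequency score than the flat ones, so a trained gate can learn to spend compute preferentially on them; how FaPS actually defines the feature, policy network, and reward is specified in the paper body, not here.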