Restoring Initial Noise Sensitivity in Text-to-Image Distillation through Geometric Alignment
Abstract
Generative distillation significantly accelerates text-to-image (T2I) generation by compressing multi-step trajectories into few-step student models while preserving perceptual quality. However, existing distillation methods prioritize efficiency and output fidelity, often overlooking critical properties inherent to the original trajectory. In this work, we identify a key property that is lost: sensitivity to initial noise, whose absence impairs downstream control methods that rely on noise-based optimization and manipulation. We trace this deficiency to standard distillation objectives, which enforce pointwise output alignment and thereby inadvertently flatten the input-output landscape, suppressing the local geometric structure present in the teacher model. To address this, we propose Geometry-Aware Distillation (GAD), a sensitivity-preserving framework that explicitly aligns the local functional behavior of the teacher and student. GAD enforces geometric consistency by matching Jacobian-vector products with respect to input noise, ensuring the student faithfully reproduces the teacher's differential response to perturbations. Extensive experiments across multiple T2I paradigms and noise-driven control tasks demonstrate that GAD significantly recovers sensitivity and improves diversity, while maintaining high visual fidelity.
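The core mechanism described above, matching teacher and student Jacobian-vector products with respect to the input noise, can be sketched as a training loss. The sketch below is illustrative, not the paper's implementation: the function name `gad_jvp_loss`, the model call signature `model(noise, cond)`, and the use of a single random tangent direction per step are all assumptions introduced for this example.

```python
import torch
import torch.nn.functional as F
from torch.func import jvp


def gad_jvp_loss(teacher, student, noise, cond):
    """Hypothetical geometry-alignment loss: match the student's
    directional derivative w.r.t. input noise to the teacher's.

    teacher, student: callables mapping (noise, cond) -> output
    noise: initial noise tensor; cond: conditioning (e.g. text embedding)
    """
    # Random tangent direction; a single probe per step keeps the
    # JVP computation cheap (one forward-mode pass per model).
    v = torch.randn_like(noise)

    # Forward-mode autodiff: jvp returns (f(x), J_f(x) @ v).
    _, jvp_teacher = jvp(lambda z: teacher(z, cond), (noise,), (v,))
    _, jvp_student = jvp(lambda z: student(z, cond), (noise,), (v,))

    # Align the student's differential response to the teacher's.
    return F.mse_loss(jvp_student, jvp_teacher.detach())
```

In practice this term would be added to the usual distillation objective with a weighting coefficient; the random tangent `v` acts as a stochastic probe of the local geometry, so over many steps the student's Jacobian is aligned with the teacher's in expectation.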