Bend the Basics: Degradation-Aware Deformable Tokenization for All-in-One Image Restoration
Zihao He ⋅ Yunfeng Wu ⋅ Xinchao Wang ⋅ Songhua Liu
Abstract
All-in-one image restoration seeks a single model that can recover images degraded by diverse and spatially non-uniform corruptions. However, many unified Transformers rely on fixed patch partitioning: the task or degradation condition is injected only into the backbone blocks after tokenization, leaving the embedding and reconstruction stages insensitive to local degradation variations. In contrast to previous approaches, we present the \textbf{Flexible Image Transformer (FIT)}, which explicitly models degradation awareness across the \emph{entire} pipeline, from patch sampling to pixel reconstruction. Specifically, FIT employs a lightweight Degradation Encoder to predict a global degradation vector $\mathbf{g}$ and a spatial degradation map $\mathbf{M}$ of local degradation severity, which jointly condition the patch embedding and unembedding through adaptive deformation. Moreover, to improve robustness across degradation types, we introduce a task-token dropout strategy that regularizes task conditioning during training. On five standard benchmarks (BSD68, Rain100L, SOTS, GoPro, and LOLv1), FIT achieves state-of-the-art performance with 30.72 dB average PSNR on the five-degradation setting and 32.83 dB on the three-degradation setting, outperforming recent unified restoration methods by +0.5$\sim$1.1 dB. Finally, the learned offsets provide a direct handle for visualizing degradation-aware spatial adaptation.
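To illustrate the degradation-aware deformation idea, the sketch below shifts each patch-sampling centre by an amount proportional to the local degradation severity in the map $\mathbf{M}$. This is a minimal NumPy illustration, not the paper's implementation: in FIT the offsets are predicted by a learned network, whereas here a hypothetical heuristic (the severity gradient) stands in for the offset predictor.

```python
import numpy as np

def deformable_patch_centers(M, patch=8, max_offset=2.0):
    """Compute deformed patch-sampling centres from a degradation map.

    M          : (H, W) array of local degradation severity in [0, 1].
    patch      : nominal patch size (H and W must be divisible by it).
    max_offset : cap (in pixels) on the learned/heuristic displacement.

    Returns an (nH, nW, 2) array of (y, x) sampling centres. Offsets are
    a hypothetical stand-in for the paper's learned offset predictor.
    """
    H, W = M.shape
    nH, nW = H // patch, W // patch
    # Nominal (rigid-grid) patch centres.
    ys = (np.arange(nH) + 0.5) * patch
    xs = (np.arange(nW) + 0.5) * patch
    cy, cx = np.meshgrid(ys, xs, indexing="ij")
    # Severity gradient points toward more degraded regions.
    gy, gx = np.gradient(M)
    # Average severity and gradient inside each patch (simple pooling).
    pool = lambda A: A.reshape(nH, patch, nW, patch).mean(axis=(1, 3))
    sev = pool(M)
    dy = max_offset * sev * np.tanh(pool(gy) * patch)
    dx = max_offset * sev * np.tanh(pool(gx) * patch)
    centres = np.stack([cy + dy, cx + dx], axis=-1)
    # Keep sampling positions inside the image.
    centres[..., 0] = np.clip(centres[..., 0], 0, H - 1)
    centres[..., 1] = np.clip(centres[..., 1], 0, W - 1)
    return centres
```

For a uniform (zero-severity) map the deformation vanishes and the centres fall back to the rigid grid, matching the intuition that fixed tokenization suffices when degradation is spatially uniform; the offsets themselves can be plotted directly, mirroring the visualization handle the abstract mentions.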