Poster

Enhanced Latent-Space Adversarial Training for Super-Resolution

Liangbin Xie ⋅ Zheyuan Li ⋅ Fanghua Yu ⋅ Xinqi Lin ⋅ Jun-hao Zhuang ⋅ Jinfan Hu ⋅ Jinjin Gu ⋅ Jiantao Zhou ⋅ Chao Dong

Abstract

Real-world super-resolution (SR) is challenging because real degradations are complex. HYPIR, a recent state-of-the-art diffusion-based restoration model, struggles with this task in a single step. A naive two-step cascade improves the results, but over-saturation, limited fine-grained detail, and high inference latency remain. To address these limitations, we present HYPIR++. It drops the degradation-removal encoder and noise augmentation to better preserve fidelity cues from low-quality inputs. To improve fine-grained detail restoration and local structure fidelity, HYPIR++ introduces a tailored latent ConvNeXt and a latent patch discriminator, enabling adversarial learning directly in the latent space. HYPIR++ also improves inference efficiency by shortening the text sequence and replacing full attention with sparse neighbor attention, which allows high-resolution images to be processed directly, without block-based tiling. Extensive experiments show that HYPIR++ achieves superior perceptual quality and a 1.71× speedup over HYPIR, establishing a new state of the art for real-world SR.
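
To give a rough sense of why replacing full attention with a neighbor-restricted pattern reduces cost, the sketch below counts query-key pairs on a 2D token grid under both schemes. It is only an illustration under stated assumptions: the abstract does not specify the neighborhood size or how borders are handled, so the `radius` parameter, the clamping at image edges, and the function names `neighbor_indices` and `attention_pair_counts` are all hypothetical and are not taken from HYPIR++.

```python
def neighbor_indices(h, w, row, col, radius):
    """Flat indices of tokens within `radius` (Chebyshev distance) of (row, col).

    Assumption: the window is clamped at the grid border, so edge tokens
    see fewer neighbors than interior tokens.
    """
    rows = range(max(0, row - radius), min(h, row + radius + 1))
    cols = range(max(0, col - radius), min(w, col + radius + 1))
    return [r * w + c for r in rows for c in cols]


def attention_pair_counts(h, w, radius):
    """Return (full, sparse) counts of query-key pairs on an h x w token grid."""
    n = h * w
    full = n * n  # full attention: every token attends to every token
    sparse = sum(
        len(neighbor_indices(h, w, r, c, radius))
        for r in range(h)
        for c in range(w)
    )
    return full, sparse


if __name__ == "__main__":
    # Hypothetical 64 x 64 latent grid with a 7 x 7 neighborhood (radius 3).
    full, sparse = attention_pair_counts(64, 64, radius=3)
    print(f"full pairs:   {full}")
    print(f"sparse pairs: {sparse}")
    print(f"reduction:    {full / sparse:.1f}x")
```

In this toy model the full pattern grows quadratically with the number of tokens, while the neighbor pattern grows linearly for a fixed radius. That difference is the general reason a local attention pattern can make tiling unnecessary at high resolution. The actual speedup reported in the abstract depends on the whole model, not on this pair count alone.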