AdaEraser: Training-Free Object Removal via Adaptive Attention Suppression
Abstract
Object removal aims to eliminate specified objects from images while plausibly inpainting the affected regions with background content. Current training-free methods typically block attention to object regions within self-attention layers during image generation, relying on the surrounding background information to restore the image. However, indiscriminate suppression of self-attention in the vacated areas can degrade generation quality, because the model must simultaneously reconstruct background content in these same regions. To resolve this conflict, we propose AdaEraser, an adaptive framework that dynamically modulates attention based on the estimated presence of target object concepts. By analyzing how self-attention maps evolve across denoising timesteps before and during removal, we develop a token-wise adaptive attention suppression strategy. This strategy lets the model progressively perceive the state of object removal throughout the denoising process and adjust the suppression strength in self-attention layers accordingly. Extensive experiments demonstrate that AdaEraser achieves superior performance in object removal, outperforming even training-based methods.
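To make the idea of token-wise adaptive suppression concrete, the following is a minimal NumPy sketch of a single self-attention layer, not the AdaEraser implementation. The abstract does not specify the exact formulation, so everything here is an assumption made for illustration: the function names `adaptive_suppressed_attention` and `estimate_presence`, the use of each query token's attention mass on the object region as its presence estimate, and the choice to scale the unnormalized attention weight of object keys by (1 - strength).

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def estimate_presence(q, k, object_mask):
    """Hypothetical presence estimate: the share of each query token's
    unsuppressed attention that falls on object tokens (a value in [0, 1])."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(logits, axis=-1)
    return weights[:, object_mask].sum(axis=-1)


def adaptive_suppressed_attention(q, k, v, object_mask, strength, eps=1e-9):
    """Self-attention with per-query suppression of object keys.

    q, k, v:      (n_tokens, dim) arrays
    object_mask:  (n_tokens,) boolean array, True for object-region tokens
    strength:     (n_tokens,) array in [0, 1]; 0 leaves attention unchanged,
                  1 (almost) fully blocks attention to object keys
    """
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # Adding log(1 - s_i) to a logit multiplies its unnormalized weight
    # by (1 - s_i); the bias is applied only to object-key columns.
    log_scale = np.log(np.clip(1.0 - strength, eps, 1.0))
    bias = log_scale[:, None] * object_mask[None, :]
    weights = softmax(logits + bias, axis=-1)
    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = rng.normal(size=(3, 6, 4))
    mask = np.array([True, True, False, False, False, False])

    # Tokens that still attend strongly to the object get suppressed more.
    strength = estimate_presence(q, k, mask)
    out, weights = adaptive_suppressed_attention(q, k, v, mask, strength)
    print("per-token strength:", np.round(strength, 3))
    print("attention mass left on object:", np.round(weights[:, mask].sum(-1), 3))
```

In a denoising loop, the strength would be re-estimated at each timestep, so suppression could relax as the object concept fades from the attention maps. A uniform strength of 1 for every token recovers the indiscriminate blocking used by earlier training-free methods.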