Plan, Decouple, Assimilate: Physics-Aware Object Insertion in Remote Sensing Imagery
Abstract
Object insertion has emerged as a promising augmentation paradigm for addressing label scarcity and long-tailed distributions in remote sensing. It generates training samples by synthesizing target instances onto real backgrounds. However, existing methods suffer from three critical issues: (i) semantic placement inconsistency, (ii) radiometric inconsistency with the scene's illumination and atmospheric conditions, and (iii) textural discontinuity. To address these issues, we propose a physics-aware method, called "Plan, Decouple, Assimilate" (PDA), for generating high-fidelity training samples. In the planning stage, the Planning (P) module automatically generates geometrically plausible bounding boxes. In the generation stage, a dual-module model generates the target instance: the Decoupling (D) module employs Asymmetric Spectral Adaptation Decoupling to disentangle structural identity from environmental illumination, while the Assimilation (A) module uses Neighborhood-Aware Texture Assimilation to harmonize the local manifold. By integrating these modules, PDA enforces multi-level consistency, from global geometry to local micro-textures. Extensive experiments show that PDA consistently outperforms existing state-of-the-art methods in generative quality, reducing whole-image FID by 15.7%, and substantially improves downstream detection performance, boosting average mAP50 by 15.9% over the real-data baseline.