Factor-Wise Homogeneity of Slot-Attention for Continual Object-Centric Learning
Abstract
While Object-Centric Learning has shown great promise in modular perception, its extension to Continual Learning remains underexplored. In this work, we observe that Slot Attention exhibits a distinctive behavior: it organizes latent representations into small, well-separated regions, each preserving an identical factor state, and this structure emerges not only within the current task but also across sequential tasks that introduce novel factors. This inter-task separation offers significant advantages in continual learning, which typically suffers from severe object-wise forgetting. We refer to this phenomenon as Factor-Wise Homogeneity, and show that this intrinsic inter-task separation serves as a key mechanism for preventing catastrophic forgetting in Continual Object-Centric Learning. However, despite its strong robustness, factor-wise homogeneity alone is insufficient, because the decoder becomes a bottleneck in exploiting this separation. To overcome this limitation and demonstrate the significance of our findings, we show that a minimal strategy, Decoder-only Post-Replay, which freezes the factor-wise homogeneous representations and fine-tunes only the decoder, is sufficient. This work serves as a fundamental basis for understanding and leveraging the intrinsic dynamics of Slot Attention, offering essential insights for advancing object-centric systems.
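To make the Decoder-only Post-Replay strategy described above concrete, the following is a minimal PyTorch sketch of the idea: freeze the slot-attention encoder so its factor-wise homogeneous representations are preserved, and fine-tune only the decoder on replayed samples. The module names (encoder, decoder), the replay loader, and the reconstruction loss are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of Decoder-only Post-Replay (illustrative, not the authors' code).
import torch
import torch.nn as nn

def decoder_only_post_replay(encoder: nn.Module,
                             decoder: nn.Module,
                             replay_loader,
                             epochs: int = 1,
                             lr: float = 1e-4,
                             device: str = "cpu"):
    """Freeze the slot-attention encoder and fine-tune only the decoder
    on replayed data, keeping the factor-wise homogeneous slots intact."""
    # Freeze the encoder so the learned slot representations do not drift.
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()

    optimizer = torch.optim.Adam(decoder.parameters(), lr=lr)
    criterion = nn.MSELoss()  # standard reconstruction objective (assumed)

    for _ in range(epochs):
        for images in replay_loader:
            images = images.to(device)
            with torch.no_grad():
                slots = encoder(images)   # frozen slot representations
            recon = decoder(slots)        # only the decoder receives gradients
            loss = criterion(recon, images)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return decoder
```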