PACEAttention: Principled and Adaptive Feature Compression-Expansion Grounded in the Geometry of $\text{MCR}^2$
Xiaojie Yu ⋅ Haibo Zhang ⋅ Jeremiah D. Deng ⋅ Lizhi Peng
Abstract
The maximal coding rate reduction ($\text{MCR}^2$) objective was proposed for learning low-dimensional subspace representations and for principled deep model design, where layer structures are derived by unrolling its optimization steps. However, existing methods motivated by this objective do not fully adhere to the design principles implied by the $\text{MCR}^2$ gradient, which weakens the principled and interpretable foundations of the resulting models. In this work, we introduce PACEAttention, a novel principled attention mechanism inspired by the \textit{geometric insight} of $\text{MCR}^2$. From this geometric perspective, gradient-based updates of $\text{MCR}^2$ move features along directions shaped by the underlying low-dimensional feature structure. Our method captures this structure by leveraging randomization to guide feature updates. This principled construction enables the resulting PACENet to exhibit enhanced interpretability, with different heads attending to distinct image regions and capturing \textit{fine-grained} structures under simple supervised training. In addition, two learnable weights in PACEAttention enable explicit regulation of the feature update dynamics, reflecting the relative contributions of different components across layers. Experiments demonstrate that PACEAttention achieves superior performance and more stable scalability than previous principled modules while maintaining low complexity.
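To make the objective underlying this work concrete, the following is a minimal sketch of the $\text{MCR}^2$ coding rate reduction as usually defined: $\Delta R = R(Z,\epsilon) - R_c(Z,\epsilon\,|\,\Pi)$, where $R(Z,\epsilon)=\tfrac{1}{2}\log\det\!\big(I + \tfrac{d}{n\epsilon^2}ZZ^\top\big)$ and $R_c$ averages the same rate over class-conditional subsets. The function names and the hard-label (rather than soft membership $\Pi$) interface here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    # Rate of the whole feature set: R(Z, eps) = 1/2 * logdet(I + d/(n eps^2) Z Z^T).
    # Z has shape (d, n): d-dimensional features as columns.
    d, n = Z.shape
    M = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    return 0.5 * np.linalg.slogdet(M)[1]

def coding_rate_reduction(Z, labels, eps=0.5):
    # Delta R = R(Z) - sum_c (n_c / n) * R(Z_c): expand globally, compress per class.
    # `labels` uses hard class assignments, a simplification of the membership matrix Pi.
    d, n = Z.shape
    Rc = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        nc = Zc.shape[1]
        Mc = np.eye(d) + (d / (nc * eps**2)) * (Zc @ Zc.T)
        Rc += (nc / (2.0 * n)) * np.linalg.slogdet(Mc)[1]
    return coding_rate(Z, eps) - Rc
```

Features from distinct classes lying on (near-)orthogonal low-dimensional subspaces yield a large positive $\Delta R$, which is the geometric structure the gradient-based updates described above exploit.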