HPS: Hyperspherical Parameter Sharing for Efficient Multi-Agent Reinforcement Learning
Abstract
Parameter Sharing (PS) is widely used to improve efficiency in Multi-Agent Reinforcement Learning (MARL), but it can limit behavioral diversity and degrade performance. This limitation stems from gradient conflicts among agents on the shared weights, which hinder effective policy learning. To fully characterize this phenomenon, we propose Geometric Gradient Decomposition Analysis, which decomposes each agent's gradient with respect to a weight vector into radial (scale) and tangential (direction) components, and uncover a key insight: agents largely agree on directional updates but substantially disagree on scale updates. Consequently, while recent methods split the shared network into agent-specific subnetworks to mitigate conflicts, they also discard the shared directional updates, limiting training efficiency. To address this issue, we propose Hyperspherical Parameter Sharing (HPS), which explicitly decouples direction and scale in parameter sharing. Specifically, HPS constrains the shared backbone weights onto a Riemannian manifold (the unit hypersphere), enforcing purely directional learning. Building on this, an agent-specific scale generator outputs multiplicative modulation factors to adjust each agent's scales, preserving heterogeneous response magnitudes without disrupting the shared directions. Experiments on SMAC, SMACv2, VMAS, and Predator-Prey demonstrate that HPS effectively resolves the scale conflict, significantly outperforming state-of-the-art methods.
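The two ideas in the abstract, the radial/tangential gradient decomposition and the direction-scale decoupling, can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's implementation: the function names (`decompose_gradient`, `hps_linear`) and the per-agent scalar `scale` are our own illustrative choices, and a real HPS layer would normalize weights and generate scales inside the network.

```python
import numpy as np

def decompose_gradient(w, g):
    """Split a gradient g into radial (scale) and tangential (direction)
    components with respect to a (flattened) weight vector w.

    The radial part is the projection of g onto the unit vector w/||w||,
    i.e. the part that changes the norm of w; the tangential part is the
    orthogonal remainder, which only rotates w's direction.
    """
    w_hat = w / np.linalg.norm(w)       # unit direction of the weight vector
    radial = np.dot(g, w_hat) * w_hat   # changes the scale ||w||
    tangential = g - radial             # changes the direction of w
    return radial, tangential

def hps_linear(x, W_shared, scale):
    """Illustrative HPS-style linear layer (names are hypothetical).

    Rows of the shared weight matrix are projected onto the unit
    hypersphere, so shared learning is purely directional; an
    agent-specific multiplicative factor `scale` restores a
    heterogeneous response magnitude per agent.
    """
    W_dir = W_shared / np.linalg.norm(W_shared, axis=1, keepdims=True)
    return scale * (x @ W_dir.T)

# Toy example: the two components recombine to g, and the tangential
# component is orthogonal to the weight vector.
w = np.array([3.0, 4.0])
g = np.array([1.0, 2.0])
r, t = decompose_gradient(w, g)
```

Under this sketch, a "scale conflict" corresponds to agents producing radial components of opposite sign on the same shared weight vector; constraining the weights to the unit hypersphere removes that axis of disagreement while leaving the (largely agreed-upon) tangential updates intact.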