DecFus: Decentralized Layer-wise Fusion with Dynamic Exploration and Exploitation
Abstract
Decentralized Federated Learning (DFL) enables collaborative model training across connected clients without a central server, effectively mitigating communication bottlenecks and avoiding the single point of failure of Centralized Federated Learning (CFL). However, existing DFL methods rely mostly on parameter averaging, which compromises update directions and limits performance due to insufficient exploration of the loss landscape, especially for complex models. We observe that layer exchanges among clients enhance exploration but introduce instability because update directions become highly diverse. To address these limitations, we propose Decentralized Layer-wise Fusion (DecFus), the first DFL framework that unifies layer-level exchange and averaging to balance exploration and exploitation. DecFus dynamically transitions decentralized training from an exploration-dominant phase to an exploitation-dominant phase, guided by the loss variance among connected neighbors. Furthermore, a layer-wise fusion strategy, informed by pairwise cosine similarity, categorizes the layers into two groups: an exchange group for exploration and an averaging group for exploitation. Theoretically, we establish the convergence of DecFus without relying on the common assumption in the literature that the aggregation matrix is doubly stochastic. Extensive experiments demonstrate that DecFus achieves superior performance in both IID and non-IID scenarios, substantially outperforming existing CFL and DFL methods.
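The abstract describes two mechanisms: a loss-variance signal that shifts training from exploration to exploitation, and a cosine-similarity test that assigns each layer to an exchange or an averaging group. Below is a minimal sketch of how one such fusion step might look for a single client and one neighbor. The function names, the `sim_threshold` parameter, the use of raw loss variance as the phase signal, and computing similarity on layer weights rather than update directions are all illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened parameter tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fuse_layers(own_layers, neighbor_layers, neighbor_losses, sim_threshold=0.5):
    """Sketch of one layer-wise fusion step for a single client.

    Layers whose parameters diverge from the neighbor's (low cosine
    similarity) are exchanged to promote exploration; well-aligned layers
    are averaged to exploit. The loss variance among connected neighbors
    gates exploration: once losses agree (low variance), every layer falls
    back to averaging.
    """
    # Assumption: raw loss variance among neighbors serves as the phase signal.
    exploring = np.var(neighbor_losses) > 1e-3
    fused = {}
    for name, w_own in own_layers.items():
        w_nbr = neighbor_layers[name]
        sim = cosine_similarity(w_own, w_nbr)
        if exploring and sim < sim_threshold:
            fused[name] = w_nbr.copy()            # exchange group: adopt the neighbor's layer
        else:
            fused[name] = 0.5 * (w_own + w_nbr)   # averaging group: plain parameter averaging
    return fused

# Toy usage: two layers, one neighbor, diverse neighbor losses.
rng = np.random.default_rng(0)
own = {"conv1": rng.normal(size=(8, 3)), "fc": rng.normal(size=(4,))}
nbr = {k: v + rng.normal(scale=0.5, size=v.shape) for k, v in own.items()}
new_model = fuse_layers(own, nbr, neighbor_losses=[0.9, 0.4, 1.3])
```

With multiple neighbors, the same per-layer decision could be repeated pairwise or applied to a neighbor average; the sketch fixes one neighbor only to keep the exchange-versus-averaging split explicit.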