FedUSD: Unbiased Synthetic Data for Federated Learning
Weiying Xie ⋅ Chenhe Hao ⋅ Haozhi Shi ⋅ Jitao Ma ⋅ Daixun Li ⋅ Jiazhe Li ⋅ Hengyi Wang ⋅ Leyuan Fang ⋅ Yunsong Li
Abstract
Aggregation-Free Federated Learning enables joint training by sharing synthetic data, aiming to eliminate data heterogeneity across clients. However, existing methods fail to explicitly separate the principal and residual components of a dataset, leading to biased synthetic data. In this paper, we propose FedUSD, a novel Unbiased Synthetic Data optimization method for Aggregation-Free Federated Learning, which exploits the High-energy Orthogonal Base (HOB) and the variance of the dataset in feature space. FedUSD is inspired by the discovery that the principal component concentrates in the HOB while the residual component is independently reflected in the variance, regardless of the network architecture. Based on this observation, we develop a method that mathematically optimizes synthetic data by matching both its HOB and variance with those of the real data. In addition, we experimentally show that leveraging the HOB and variance to separately extract the principal and residual components is more effective than existing approaches. We also theoretically prove that FedUSD produces unbiased synthetic data and thus guarantees convergence. Without introducing any additional constraints, FedUSD yields significant improvements in global model performance over the state of the art under equivalent communication costs. For example, on the SVHN dataset with Dirichlet coefficient $\alpha=0.01$, FedUSD outperforms other methods by 6.74\% to 30.82\%.
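To make the matching objective concrete, the following is a minimal sketch of how one might align the HOB and variance of synthetic features with those of real features. It is not the authors' implementation: the function name, the choice of top-$k$ right singular vectors as the high-energy base, the projection-matrix form of the subspace term, and the equal weighting of the two terms are all illustrative assumptions.

```python
import torch

def unbiased_matching_loss(real_feats, syn_feats, k=16):
    """Illustrative FedUSD-style objective (hypothetical, not the paper's code):
    align the top-k high-energy orthogonal base (HOB) and the per-dimension
    variance of synthetic features with those of real features.

    real_feats, syn_feats: (n, d) feature matrices from a frozen encoder.
    """
    # Center both feature sets so the SVD captures directions of variation.
    real_c = real_feats - real_feats.mean(dim=0, keepdim=True)
    syn_c = syn_feats - syn_feats.mean(dim=0, keepdim=True)

    # Top-k right singular vectors span the high-energy subspace (assumed HOB).
    _, _, Vr = torch.linalg.svd(real_c, full_matrices=False)
    _, _, Vs = torch.linalg.svd(syn_c, full_matrices=False)
    hob_real, hob_syn = Vr[:k], Vs[:k]  # (k, d), orthonormal rows

    # Principal term: compare the subspaces via their projection matrices,
    # which is invariant to the sign/ordering ambiguity of the SVD.
    proj_real = hob_real.T @ hob_real
    proj_syn = hob_syn.T @ hob_syn
    loss_hob = (proj_real - proj_syn).pow(2).sum()

    # Residual term: match per-dimension feature variance.
    loss_var = (real_c.var(dim=0) - syn_c.var(dim=0)).pow(2).sum()

    return loss_hob + loss_var
```

In use, one would pass real and synthetic batches through a frozen feature extractor `f` and backpropagate the loss into the synthetic inputs themselves, e.g. `unbiased_matching_loss(f(x_real), f(x_syn)).backward()` followed by a gradient step on `x_syn`; the exact optimization schedule is likewise an assumption here.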