To scale up data analysis, distributed and parallel computing approaches are increasingly needed. Here we study a fundamental problem in this area: how to perform ridge regression in a distributed computing environment? We study one-shot methods that construct weighted combinations of ridge regression estimators computed on each machine. By analyzing the mean squared error in a high-dimensional model where each predictor has a small effect, we discover several new phenomena, including that the efficiency depends strongly on the signal strength but does not degrade with many workers, that the risk decouples over machines, and, unexpectedly, that the optimal weights do not sum to one. We also propose a new optimally weighted one-shot ridge regression algorithm. Our results are supported by simulations and real data analysis.
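As a minimal sketch of the one-shot scheme described above, the snippet below fits a ridge estimator on each machine's local data and returns a weighted combination at the center. The uniform weights used here are a placeholder assumption for illustration only; the paper's algorithm instead uses optimal weights, which depend on the signal strength and need not sum to one. All function names and parameters below are hypothetical, not from the paper.

```python
# One-shot distributed ridge regression: each "machine" fits a local ridge
# estimator, and the center combines them with weights.
import numpy as np

def local_ridge(X, y, lam):
    """Ridge estimator (X^T X + lam * I)^{-1} X^T y on one machine."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def one_shot_ridge(X, y, k, lam, weights=None):
    """Split rows into k blocks, fit ridge locally, return a weighted combination."""
    X_blocks = np.array_split(X, k)
    y_blocks = np.array_split(y, k)
    betas = [local_ridge(Xi, yi, lam) for Xi, yi in zip(X_blocks, y_blocks)]
    if weights is None:
        # Placeholder uniform weights; the paper derives optimal weights,
        # which depend on unknown signal strength and need not sum to one.
        weights = np.full(k, 1.0 / k)
    return sum(w * b for w, b in zip(weights, betas))

# Example usage on synthetic data with many small effects
rng = np.random.default_rng(0)
n, p, k, lam = 2000, 100, 5, 1.0
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_normal(n)
beta_hat = one_shot_ridge(X, y, k, lam)
print("estimation MSE:", np.mean((beta_hat - beta) ** 2))
```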
Author Information
Yue Sheng (University of Pennsylvania)
Edgar Dobriban (University of Pennsylvania)
More from the Same Authors
- 2022 Poster: Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces
  Yinshuang Xu · Jiahui Lei · Edgar Dobriban · Kostas Daniilidis
- 2022 Spotlight: Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces
  Yinshuang Xu · Jiahui Lei · Edgar Dobriban · Kostas Daniilidis
- 2020 Poster: The Implicit Regularization of Stochastic Gradient Flow for Least Squares
  Alnur Ali · Edgar Dobriban · Ryan Tibshirani
- 2020 Poster: DeltaGrad: Rapid retraining of machine learning models
  Yinjun Wu · Edgar Dobriban · Susan B Davidson