Timezone: »
We study robust distributed learning that involves minimizing a non-convex loss function with saddle points. We consider the Byzantine setting where some worker machines have abnormal or even arbitrary and adversarial behavior, and in this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used. We develop ByzantinePGD, a robust first-order algorithm that can provably escape saddle points and fake local minima, and converge to an approximate true local minimizer with low iteration complexity. As a by-product, we give a simpler algorithm and analysis for escaping saddle points in the usual non-Byzantine setting. We further discuss three robust gradient estimators that can be used in ByzantinePGD, including median, trimmed mean, and iterative filtering. We characterize their performance in concrete statistical settings, and argue for their near-optimality in low and high dimensional regimes.
Author Information
Dong Yin (UC Berkeley)
Yudong Chen (Cornell University)
Kannan Ramchandran (UC Berkeley)
Peter Bartlett (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Oral: Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning »
Thu. Jun 13th 04:00 -- 04:20 PM Room Room 104
More from the Same Authors
-
2021 : When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations? »
Niladri Chatterji · Phil Long · Peter Bartlett -
2021 : On the Theory of Reinforcement Learning with Once-per-Episode Feedback »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2023 : Competing Bandits in Non-Stationary Matching Markets »
Avishek Ghosh · Abishek Sankararaman · Kannan Ramchandran · Tara Javidi · Arya Mazumdar -
2022 Poster: Neurotoxin: Durable Backdoors in Federated Learning »
Zhengming Zhang · Ashwinee Panda · Linyue Song · Yaoqing Yang · Michael Mahoney · Prateek Mittal · Kannan Ramchandran · Joseph E Gonzalez -
2022 Spotlight: Neurotoxin: Durable Backdoors in Federated Learning »
Zhengming Zhang · Ashwinee Panda · Linyue Song · Yaoqing Yang · Michael Mahoney · Prateek Mittal · Kannan Ramchandran · Joseph E Gonzalez -
2021 : On the Theory of Reinforcement Learning with Once-per-Episode Feedback »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2021 : Adversarial Examples in Random Deep Networks »
Peter Bartlett -
2020 Poster: On Thompson Sampling with Langevin Algorithms »
Eric Mazumdar · Aldo Pacchiano · Yian Ma · Michael Jordan · Peter Bartlett -
2020 Poster: Accelerated Message Passing for Entropy-Regularized MAP Inference »
Jonathan Lee · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2020 Poster: Stochastic Gradient and Langevin Processes »
Xiang Cheng · Dong Yin · Peter Bartlett · Michael Jordan -
2019 : Spotlight »
Tyler Scott · Kiran Thekumparampil · Jonathan Aigrain · Rene Bidart · Priyadarshini Panda · Dian Ang Yap · Yaniv Yacoby · Raphael Gontijo Lopes · Alberto Marchisio · Erik Englesson · Wanqian Yang · Moritz Graule · Yi Sun · Daniel Kang · Mike Dusenberry · Min Du · Hartmut Maennel · Kunal Menda · Vineet Edupuganti · Luke Metz · David Stutz · Vignesh Srinivasan · Timo Sämann · Vineeth N Balasubramanian · Sina Mohseni · Rob Cornish · Judith Butepage · Zhangyang Wang · Bai Li · Bo Han · Honglin Li · Maksym Andriushchenko · Lukas Ruff · Meet P. Vadera · Yaniv Ovadia · Sunil Thulasidasan · Disi Ji · Gang Niu · Saeed Mahloujifar · Aviral Kumar · SANGHYUK CHUN · Dong Yin · Joyce Xu Xu · Hugo Gomes · Raanan Rohekar -
2019 Poster: Scale-free adaptive planning for deterministic dynamics & discounted rewards »
Peter Bartlett · Victor Gabillon · Jennifer Healey · Michal Valko -
2019 Oral: Scale-free adaptive planning for deterministic dynamics & discounted rewards »
Peter Bartlett · Victor Gabillon · Jennifer Healey · Michal Valko -
2019 Poster: Rademacher Complexity for Adversarially Robust Generalization »
Dong Yin · Kannan Ramchandran · Peter Bartlett -
2019 Poster: Online learning with kernel losses »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett -
2019 Oral: Rademacher Complexity for Adversarially Robust Generalization »
Dong Yin · Kannan Ramchandran · Peter Bartlett -
2019 Oral: Online learning with kernel losses »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett -
2018 Poster: Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks »
Peter Bartlett · Dave Helmbold · Phil Long -
2018 Poster: On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo »
Niladri Chatterji · Nicolas Flammarion · Yian Ma · Peter Bartlett · Michael Jordan -
2018 Oral: On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo »
Niladri Chatterji · Nicolas Flammarion · Yian Ma · Peter Bartlett · Michael Jordan -
2018 Oral: Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks »
Peter Bartlett · Dave Helmbold · Phil Long -
2018 Poster: Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2018 Oral: Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2017 Poster: Recovery Guarantees for One-hidden-layer Neural Networks »
Kai Zhong · Zhao Song · Prateek Jain · Peter Bartlett · Inderjit Dhillon -
2017 Poster: The Sample Complexity of Online One-Class Collaborative Filtering »
Reinhard Heckel · Kannan Ramchandran -
2017 Talk: Recovery Guarantees for One-hidden-layer Neural Networks »
Kai Zhong · Zhao Song · Prateek Jain · Peter Bartlett · Inderjit Dhillon -
2017 Talk: The Sample Complexity of Online One-Class Collaborative Filtering »
Reinhard Heckel · Kannan Ramchandran