Poster
Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data
Yonggui Yan · Jie Chen · Pin-Yu Chen · Xiaodong Cui · Songtao Lu · Yangyang Xu
We first propose a decentralized proximal stochastic gradient tracking method (DProxSGT) for nonconvex stochastic composite problems, with data heterogeneously distributed on multiple workers in a decentralized connected network. To save communication cost, we then extend DProxSGT to a compressed method by compressing the communicated information. Both methods need only $\mathcal{O}(1)$ samples per worker for each proximal update, which is important for achieving good generalization performance when training deep neural networks. With a smoothness condition on the expected loss function (but not on each sample function), the proposed methods can achieve an optimal sample complexity result to produce a near-stationary point. Numerical experiments on training neural networks demonstrate that our methods generalize significantly better than large-batch training methods and momentum variance-reduction methods, and that the gradient tracking scheme is able to handle heterogeneous data.
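The general idea of combining gradient tracking with a proximal step can be illustrated with a small sketch. This is not the authors' exact DProxSGT update, and it omits compression entirely. It is a generic proximal gradient tracking loop under assumptions chosen only for illustration: each worker holds a scalar variable and a quadratic local loss 0.5 * (x - a_i)^2 with a different target a_i (the heterogeneity), the regularizer is lam * |x| (so the proximal map is soft-thresholding), the network is a ring with uniform 1/3 mixing weights, and each worker draws one noisy gradient per update. The names `soft_threshold`, `mix`, and `run_prox_gradient_tracking` are hypothetical.

```python
import random


def soft_threshold(v, tau):
    """Proximal map of tau * |x| (assumed regularizer for this sketch)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def mix(W, values):
    """One round of neighbor averaging with mixing matrix W."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


def run_prox_gradient_tracking(targets, lam=0.5, step=0.1, iters=500,
                               noise_std=0.0, seed=0):
    """Generic proximal gradient tracking on a ring (illustrative only).

    Worker i holds the assumed local loss 0.5 * (x - targets[i])**2.
    Requires at least 3 workers so the ring weights are well defined.
    """
    rng = random.Random(seed)
    n = len(targets)

    # Assumed ring topology: weight 1/3 on self and on each neighbor,
    # which gives a symmetric, doubly stochastic mixing matrix.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0

    def stoch_grad(i, x):
        # One noisy gradient sample per worker per update.
        return (x - targets[i]) + rng.gauss(0.0, noise_std)

    x = [0.0] * n
    g = [stoch_grad(i, x[i]) for i in range(n)]
    y = list(g)  # tracker of the network-average gradient

    for _ in range(iters):
        x_mixed = mix(W, x)
        y_mixed = mix(W, y)
        # Proximal step along the tracked direction.
        x = [soft_threshold(x_mixed[i] - step * y[i], step * lam)
             for i in range(n)]
        # Tracking update: mix, then add the change in local gradients.
        g_new = [stoch_grad(i, x[i]) for i in range(n)]
        y = [y_mixed[i] + g_new[i] - g[i] for i in range(n)]
        g = g_new
    return x


if __name__ == "__main__":
    # Heterogeneous targets with mean 3.0; the averaged objective
    # 0.5 * mean((x - a_i)^2) + 0.5 * |x| is minimized at 3.0 - 0.5 = 2.5.
    print(run_prox_gradient_tracking([1.0, 2.0, 3.0, 6.0]))
```

In this toy setting the tracker `y` lets every worker follow the average gradient over the network rather than its own local gradient, which is why the workers can agree on the minimizer of the averaged objective even though their local targets differ.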
Author Information
Yonggui Yan
Jie Chen (MIT-IBM Watson AI Lab, IBM Research)
Pin-Yu Chen (IBM Research)
Xiaodong Cui
Songtao Lu (IBM Thomas J. Watson Research Center)
Yangyang Xu (Rensselaer Polytechnic Institute)
More from the Same Authors
-
2021 : Generalizing Adversarial Training to Composite Semantic Perturbations »
Yun-Yun Tsai · Lei Hsiung · Pin-Yu Chen · Tsung-Yi Ho -
2021 : On the Effectiveness of Poisoning against Unsupervised Domain Adaptation »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2023 : Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 : On Robustness-Accuracy Characterization of Large Language Models using Synthetic Datasets »
Ching-Yun (Irene) Ko · Pin-Yu Chen · Payel Das · Yung-Sung Chuang · Luca Daniel -
2023 Oral: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks »
Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen -
2023 Poster: GC-Flow: A Graph-Based Flow Network for Effective Clustering »
Tianchun Wang · Farzaneh Mirzazadeh · Xiang Zhang · Jie Chen -
2023 Poster: MultiRobustBench: Benchmarking Robustness Against Multiple Attacks »
Sophie Dai · Saeed Mahloujifar · Chong Xiang · Vikash Sehwag · Pin-Yu Chen · Prateek Mittal -
2023 Poster: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 Poster: Prometheus: Taming Sample and Communication Complexities in Constrained Decentralized Stochastic Bilevel Learning »
Zhuqing Liu · Xin Zhang · Prashant Khanduri · Songtao Lu · Jia Liu -
2023 Poster: Identification of the Adversary from a Single Adversarial Example »
Minhao Cheng · Rui Min · Haochen Sun · Pin-Yu Chen -
2023 Poster: Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction »
Minghao Guo · Veronika Thost · Samuel Song · Adithya Balachandran · Payel Das · Jie Chen · Wojciech Matusik -
2023 Oral: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 Poster: Bilevel Optimization with Coupled Decision-Dependent Distributions »
Songtao Lu -
2023 Poster: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks »
Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen -
2023 Poster: Reprogramming Pretrained Language Models for Antibody Sequence Infilling »
Igor Melnyk · Vijil Chenthamarakshan · Pin-Yu Chen · Payel Das · Amit Dhurandhar · Inkit Padhi · Devleena Das -
2023 Poster: A Gromov–Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening »
Yifan Chen · Rentian Yao · Yun Yang · Jie Chen -
2022 Poster: A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization »
Songtao Lu -
2022 Spotlight: A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization »
Songtao Lu -
2019 Poster: DAG-GNN: DAG Structure Learning with Graph Neural Networks »
Yue Yu · Jie Chen · Tian Gao · Mo Yu -
2019 Oral: DAG-GNN: DAG Structure Learning with Graph Neural Networks »
Yue Yu · Jie Chen · Tian Gao · Mo Yu -
2019 Poster: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization »
Songtao Lu · Mingyi Hong · Zhengdao Wang -
2019 Oral: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization »
Songtao Lu · Mingyi Hong · Zhengdao Wang