Timezone: »
We study oracle complexity of gradient based methods for stochastic approximation problems. Though in many settings optimal algorithms and tight lower bounds are known for such problems, these optimal algorithms do not achieve the best performance when used in practice. We address this theory-practice gap by focusing on \emph{instance-dependent complexity} instead of worst case complexity. In particular, we first summarize known instance-dependent complexity results and categorize them into three levels. We identify the domination relation between different levels and propose a fourth instance-dependent bound that dominates existing ones. We then provide a sufficient condition according to which an adaptive algorithm with moment estimation can achieve the proposed bound without knowledge of noise levels. Our proposed algorithm and its analysis provide a theoretical justification for the success of moment estimation as it achieves improved instance complexity.
Author Information
Jingzhao Zhang (Tsinghua University)
Hongzhou Lin (Amazon)
Subhro Das (MIT-IBM Watson AI Lab, IBM Research)
Subhro Das is a Research Staff Member and Manager at the MIT-IBM AI Lab, IBM Research, Cambridge MA. As a Principal Investigator (PI), he works on developing novel AI algorithms in collaboration with MIT. He is a Research Affiliate at MIT, co-leading IBM's engagement in the MIT Quest for Intelligence. He serves as the Chair of the AI Learning Professional Interest Community (PIC) at IBM Research. His research interests are broadly in the areas of Trustworthy ML, Reinforcement Learning and ML Optimization. At the MIT-IBM AI Lab, he works on developing novel AI algorithms for uncertainty quantification and human-centric AI systems; robust, accelerated, online & distributed optimization; and, safe, unstable & multi-agent reinforcement learning. He led the Future of Work initiative within IBM Research, studying the impact of AI on the labor market and developing AI-driven recommendation frameworks for skills and talent management. Previously, at the IBM T.J. Watson Research Center in New York, he worked on developing signal processing and machine learning based predictive algorithms for a broad variety of biomedical and healthcare applications. He received MS and PhD degrees in Electrical and Computer Engineering from Carnegie Mellon University in 2014 and 2016, respectively, and Bachelors (B.Tech.) degree in Electronics & Electrical Communication Engineering from Indian Institute of Technology Kharagpur in 2011.
Suvrit Sra (MIT & Macro-Eyes)
Ali Jadbabaie (Massachusetts Institute of Technology)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity »
Wed. Jul 20th through Thu the 21st Room Hall E #601
More from the Same Authors
-
2022 : Fast Convergence for Unstable Reinforcement Learning Problems by Logarithmic Mapping »
Wang Zhang · Lam Nguyen · Subhro Das · Alexandre Megretsky · Luca Daniel · Tsui-Wei Weng -
2023 Poster: Global optimality for Euclidean CCCP under Riemannian convexity »
Melanie Weber · Suvrit Sra -
2023 Poster: On the Training Instability of Shuffling SGD with Batch Normalization »
David X. Wu · Chulhee Yun · Suvrit Sra -
2023 Poster: ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction »
Wang Zhang · Lily Weng · Subhro Das · Alexandre Megretsky · Luca Daniel · Lam Nguyen -
2022 : Sign and Basis Invariant Networks for Spectral Graph Representation Learning »
Derek Lim · Joshua Robinson · Lingxiao Zhao · Tess Smidt · Suvrit Sra · Haggai Maron · Stefanie Jegelka -
2022 Poster: Selective Regression under Fairness Criteria »
Abhin Shah · Yuheng Bu · Joshua Lee · Subhro Das · Rameswar Panda · Prasanna Sattigeri · Gregory Wornell -
2022 Spotlight: Selective Regression under Fairness Criteria »
Abhin Shah · Yuheng Bu · Joshua Lee · Subhro Das · Rameswar Panda · Prasanna Sattigeri · Gregory Wornell -
2022 Poster: Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective »
Jingzhao Zhang · Haochuan Li · Suvrit Sra · Ali Jadbabaie -
2022 Poster: Understanding the unstable convergence of gradient descent »
Kwangjun Ahn · Jingzhao Zhang · Suvrit Sra -
2022 Poster: On Convergence of Gradient Descent Ascent: A Tight Local Analysis »
Haochuan Li · Farzan Farnia · Subhro Das · Ali Jadbabaie -
2022 Spotlight: On Convergence of Gradient Descent Ascent: A Tight Local Analysis »
Haochuan Li · Farzan Farnia · Subhro Das · Ali Jadbabaie -
2022 Spotlight: Understanding the unstable convergence of gradient descent »
Kwangjun Ahn · Jingzhao Zhang · Suvrit Sra -
2022 Spotlight: Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective »
Jingzhao Zhang · Haochuan Li · Suvrit Sra · Ali Jadbabaie -
2021 Poster: Fair Selective Classification Via Sufficiency »
Joshua Lee · Yuheng Bu · Deepta Rajan · Prasanna Sattigeri · Rameswar Panda · Subhro Das · Gregory Wornell -
2021 Oral: Fair Selective Classification Via Sufficiency »
Joshua Lee · Yuheng Bu · Deepta Rajan · Prasanna Sattigeri · Rameswar Panda · Subhro Das · Gregory Wornell -
2021 Poster: Provably Efficient Algorithms for Multi-Objective Competitive RL »
Tiancheng Yu · Yi Tian · Jingzhao Zhang · Suvrit Sra -
2021 Poster: Online Learning in Unknown Markov Games »
Yi Tian · Yuanhao Wang · Tiancheng Yu · Suvrit Sra -
2021 Spotlight: Online Learning in Unknown Markov Games »
Yi Tian · Yuanhao Wang · Tiancheng Yu · Suvrit Sra -
2021 Oral: Provably Efficient Algorithms for Multi-Objective Competitive RL »
Tiancheng Yu · Yi Tian · Jingzhao Zhang · Suvrit Sra -
2021 Poster: Three Operator Splitting with a Nonconvex Loss Function »
Alp Yurtsever · Varun Mangalick · Suvrit Sra -
2021 Spotlight: Three Operator Splitting with a Nonconvex Loss Function »
Alp Yurtsever · Varun Mangalick · Suvrit Sra -
2020 Poster: Strength from Weakness: Fast Learning Using Weak Supervision »
Joshua Robinson · Stefanie Jegelka · Suvrit Sra -
2020 Poster: Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions »
Jingzhao Zhang · Hongzhou Lin · Stefanie Jegelka · Suvrit Sra · Ali Jadbabaie -
2020 Poster: Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition »
Chi Jin · Tiancheng Jin · Haipeng Luo · Suvrit Sra · Tiancheng Yu -
2019 Poster: Escaping Saddle Points with Adaptive Gradient Methods »
Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra -
2019 Oral: Escaping Saddle Points with Adaptive Gradient Methods »
Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra -
2019 Poster: Random Shuffling Beats SGD after Finite Epochs »
Jeff HaoChen · Suvrit Sra -
2019 Oral: Random Shuffling Beats SGD after Finite Epochs »
Jeff HaoChen · Suvrit Sra -
2019 Poster: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator »
Alp Yurtsever · Suvrit Sra · Volkan Cevher -
2019 Oral: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator »
Alp Yurtsever · Suvrit Sra · Volkan Cevher