In many reinforcement learning applications, the system is assumed to be inherently stable, with bounded reward, state, and action spaces. These are key requirements for convergence when optimizing the classical discounted reinforcement learning reward. Unfortunately, these assumptions do not hold for many real-world problems, such as an unstable linear–quadratic regulator (LQR). In this work, we propose new methods to stabilize and speed up the convergence of policy gradient methods on unstable reinforcement learning problems. We provide theoretical insights into the efficiency of our methods, and in practice we obtain strong experimental results on multiple examples where the vanilla methods mostly fail to converge due to system instability.
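To see why these assumptions can break down (this is an illustrative sketch, not the paper's method), consider a scalar unstable LQR under a linear state-feedback policy: unless the gain stabilizes the closed loop, the state and the quadratic cost grow without bound, so the bounded-state and bounded-reward assumptions behind discounted policy gradient no longer apply. All parameter values below are illustrative choices.

```python
# Minimal illustrative sketch (not the authors' method): a scalar LQR
# x_{t+1} = a*x_t + b*u_t with |a| > 1 (open loop unstable) and a linear
# policy u_t = -k*x_t. The closed-loop pole is a - b*k; if its magnitude
# exceeds 1, the state and the quadratic cost grow without bound, violating
# the bounded state/reward assumptions behind discounted policy gradient.
a, b = 1.5, 1.0   # illustrative unstable dynamics
q, r = 1.0, 1.0   # quadratic cost weights

def rollout_cost(k, horizon=50, x0=1.0):
    """Total quadratic cost of the policy u = -k*x over a finite horizon."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += q * x ** 2 + r * u ** 2
        x = a * x + b * u
    return cost

for k in (0.0, 0.2, 0.8):  # only gains with |a - b*k| < 1 keep the cost bounded
    print(f"k={k:.1f}  closed-loop pole={a - b * k:.2f}  cost={rollout_cost(k):.3e}")
```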
Author Information
Wang Zhang (MIT)
Lam Nguyen (IBM Research, Thomas J. Watson Research Center)
Subhro Das (MIT-IBM Watson AI Lab, IBM Research)
Subhro Das is a Research Staff Member and Manager at the MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA. As a Principal Investigator (PI), he works on developing novel AI algorithms in collaboration with MIT. He is a Research Affiliate at MIT, co-leading IBM's engagement in the MIT Quest for Intelligence. He serves as the Chair of the AI Learning Professional Interest Community (PIC) at IBM Research. His research interests are broadly in the areas of Trustworthy ML, Reinforcement Learning, and ML Optimization. At the MIT-IBM Watson AI Lab, he works on developing novel AI algorithms for uncertainty quantification and human-centric AI systems; robust, accelerated, online & distributed optimization; and safe, unstable & multi-agent reinforcement learning. He led the Future of Work initiative within IBM Research, studying the impact of AI on the labor market and developing AI-driven recommendation frameworks for skills and talent management. Previously, at the IBM T.J. Watson Research Center in New York, he worked on developing signal processing and machine learning-based predictive algorithms for a broad variety of biomedical and healthcare applications. He received his MS and PhD degrees in Electrical and Computer Engineering from Carnegie Mellon University in 2014 and 2016, respectively, and his Bachelor's (B.Tech.) degree in Electronics & Electrical Communication Engineering from the Indian Institute of Technology Kharagpur in 2011.
Alexandre Megretsky (Massachusetts Institute of Technology)
Luca Daniel (Massachusetts Institute of Technology)
Tsui-Wei Weng (MIT)
More from the Same Authors
- 2021: Robust online control with model misspecification
  Xinyi Chen · Udaya Ghai · Elad Hazan · Alexandre Megretsky
- 2023: Group Fairness with Uncertainty in Sensitive Attributes
  Abhin Shah · Maohao Shen · Jongha Ryu · Subhro Das · Prasanna Sattigeri · Yuheng Bu · Gregory Wornell
- 2023: On Robustness-Accuracy Characterization of Large Language Models using Synthetic Datasets
  Ching-Yun (Irene) Ko · Pin-Yu Chen · Payel Das · Yung-Sung Chuang · Luca Daniel
- 2023 Poster: ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction
  Wang Zhang · Lily Weng · Subhro Das · Alexandre Megretsky · Luca Daniel · Lam Nguyen
- 2022 Poster: Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
  Trang Tran · Katya Scheinberg · Lam Nguyen
- 2022 Poster: Selective Regression under Fairness Criteria
  Abhin Shah · Yuheng Bu · Joshua Lee · Subhro Das · Rameswar Panda · Prasanna Sattigeri · Gregory Wornell
- 2022 Spotlight: Selective Regression under Fairness Criteria
  Abhin Shah · Yuheng Bu · Joshua Lee · Subhro Das · Rameswar Panda · Prasanna Sattigeri · Gregory Wornell
- 2022 Spotlight: Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
  Trang Tran · Katya Scheinberg · Lam Nguyen
- 2022 Poster: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
  Jingzhao Zhang · Hongzhou Lin · Subhro Das · Suvrit Sra · Ali Jadbabaie
- 2022 Spotlight: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
  Jingzhao Zhang · Hongzhou Lin · Subhro Das · Suvrit Sra · Ali Jadbabaie
- 2022 Poster: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework
  Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng
- 2022 Poster: On Convergence of Gradient Descent Ascent: A Tight Local Analysis
  Haochuan Li · Farzan Farnia · Subhro Das · Ali Jadbabaie
- 2022 Spotlight: On Convergence of Gradient Descent Ascent: A Tight Local Analysis
  Haochuan Li · Farzan Farnia · Subhro Das · Ali Jadbabaie
- 2022 Spotlight: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework
  Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng
- 2021 Poster: Fair Selective Classification Via Sufficiency
  Joshua Lee · Yuheng Bu · Deepta Rajan · Prasanna Sattigeri · Rameswar Panda · Subhro Das · Gregory Wornell
- 2021 Oral: Fair Selective Classification Via Sufficiency
  Joshua Lee · Yuheng Bu · Deepta Rajan · Prasanna Sattigeri · Rameswar Panda · Subhro Das · Gregory Wornell
- 2021 Poster: SMG: A Shuffling Gradient-Based Method with Momentum
  Trang Tran · Lam Nguyen · Quoc Tran-Dinh
- 2021 Spotlight: SMG: A Shuffling Gradient-Based Method with Momentum
  Trang Tran · Lam Nguyen · Quoc Tran-Dinh
- 2020 Poster: Neural Network Control Policy Verification With Persistent Adversarial Perturbation
  Yuh-Shyang Wang · Tsui-Wei Weng · Luca Daniel
- 2020 Poster: Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization
  Quoc Tran-Dinh · Nhan H Pham · Lam Nguyen
- 2020 Poster: Proper Network Interpretability Helps Adversarial Robustness in Classification
  Akhilan Boopathy · Sijia Liu · Gaoyuan Zhang · Cynthia Liu · Pin-Yu Chen · Shiyu Chang · Luca Daniel
- 2019 Poster: POPQORN: Quantifying Robustness of Recurrent Neural Networks
  Ching-Yun Ko · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin
- 2019 Poster: Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
  Marten van Dijk · Lam Nguyen · Phuong Ha Nguyen · Dzung Phan
- 2019 Poster: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach
  Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel
- 2019 Oral: Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
  Marten van Dijk · Lam Nguyen · Phuong Ha Nguyen · Dzung Phan
- 2019 Oral: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach
  Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel
- 2019 Oral: POPQORN: Quantifying Robustness of Recurrent Neural Networks
  Ching-Yun Ko · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin
- 2018 Poster: Towards Fast Computation of Certified Robustness for ReLU Networks
  Tsui-Wei Weng · Huan Zhang · Hongge Chen · Zhao Song · Cho-Jui Hsieh · Luca Daniel · Duane Boning · Inderjit Dhillon
- 2018 Oral: Towards Fast Computation of Certified Robustness for ReLU Networks
  Tsui-Wei Weng · Huan Zhang · Hongge Chen · Zhao Song · Cho-Jui Hsieh · Luca Daniel · Duane Boning · Inderjit Dhillon
- 2018 Poster: SGD and Hogwild! Convergence Without the Bounded Gradients Assumption
  Lam Nguyen · Phuong Ha Nguyen · Marten van Dijk · Peter Richtarik · Katya Scheinberg · Martin Takac
- 2018 Oral: SGD and Hogwild! Convergence Without the Bounded Gradients Assumption
  Lam Nguyen · Phuong Ha Nguyen · Marten van Dijk · Peter Richtarik · Katya Scheinberg · Martin Takac
- 2017 Poster: SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
  Lam Nguyen · Jie Liu · Katya Scheinberg · Martin Takac
- 2017 Talk: SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
  Lam Nguyen · Jie Liu · Katya Scheinberg · Martin Takac