Agents trained by reinforcement learning (RL) often fail to generalize beyond the environment they were trained in, even when presented with new scenarios that seem similar to the training environment. We study the query complexity required to train RL agents that generalize to multiple environments. Intuitively, tractable generalization is only possible when the environments are similar or close in some sense. To capture this, we introduce Weak Proximity, a natural structural condition that requires the environments to have highly similar transition and reward functions and share a policy providing optimal value. Despite such shared structure, we prove that tractable generalization is impossible in the worst case. This holds even when each individual environment can be efficiently solved to obtain an optimal linear policy, and when the agent possesses a generative model. Our lower bound applies to the more complex task of representation learning for efficient generalization to multiple environments. On the positive side, we introduce Strong Proximity, a strengthened condition which we prove is sufficient for efficient generalization.
Author Information
Dhruv Malik (Carnegie Mellon University)
Yuanzhi Li (CMU)
Pradeep Ravikumar (Carnegie Mellon University)
More from the Same Authors
-
2021 : Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 : Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2023 : How Does Adaptive Optimization Impact Local Neural Network Geometry? »
Kaiqi Jiang · Dhruv Malik · Yuanzhi Li -
2023 : Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models »
Tianyu Chen · Kevin Bello · Bryon Aragam · Pradeep Ravikumar -
2023 : Learning Linear Causal Representations from Interventions under General Nonlinear Mixing »
Simon Buchholz · Goutham Rajendran · Elan Rosenfeld · Bryon Aragam · Bernhard Schölkopf · Pradeep Ravikumar -
2023 : Plan, Eliminate, and Track --- Language Models are Good Teachers for Embodied Agents. »
Yue Wu · So Yeon Min · Yonatan Bisk · Ruslan Salakhutdinov · Amos Azaria · Yuanzhi Li · Tom Mitchell · Shrimai Prabhumoye -
2023 : SPRING: Studying Papers and Reasoning to play Games »
Yue Wu · Shrimai Prabhumoye · So Yeon Min · Yonatan Bisk · Ruslan Salakhutdinov · Amos Azaria · Tom Mitchell · Yuanzhi Li -
2023 : Learning with Explanation Constraints »
Rattana Pukdee · Dylan Sam · Nina Balcan · Pradeep Ravikumar -
2023 : How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding »
Yuchen Li · Yuanzhi Li · Andrej Risteski -
2023 : Global Optimality in Bivariate Gradient-based DAG Learning »
Chang Deng · Kevin Bello · Pradeep Ravikumar · Bryon Aragam -
2023 Poster: Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality »
Dhruv Malik · Conor Igoe · Yuanzhi Li · Aarti Singh -
2023 Poster: How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding »
Yuchen Li · Yuanzhi Li · Andrej Risteski -
2023 Poster: Optimizing NOTEARS Objectives via Topological Swaps »
Chang Deng · Kevin Bello · Bryon Aragam · Pradeep Ravikumar -
2023 Poster: Representer Point Selection for Explaining Regularized High-dimensional Models »
Che-Ping Tsai · Jiong Zhang · Hsiang-Fu Yu · Eli Chien · Cho-Jui Hsieh · Pradeep Ravikumar -
2023 Poster: The Benefits of Mixup for Feature Learning »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2023 Poster: Faith-Shap: The Faithful Shapley Interaction Index »
Che-Ping Tsai · Chih-Kuan Yeh · Pradeep Ravikumar -
2022 Poster: Building Robust Ensembles via Margin Boosting »
Dinghuai Zhang · Hongyang Zhang · Aaron Courville · Yoshua Bengio · Pradeep Ravikumar · Arun Sai Suggala -
2022 Spotlight: Building Robust Ensembles via Margin Boosting »
Dinghuai Zhang · Hongyang Zhang · Aaron Courville · Yoshua Bengio · Pradeep Ravikumar · Arun Sai Suggala -
2022 Poster: Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2022 Spotlight: Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2021 Poster: DORO: Distributional and Outlier Robust Optimization »
Runtian Zhai · Chen Dan · Zico Kolter · Pradeep Ravikumar -
2021 Poster: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 Spotlight: DORO: Distributional and Outlier Robust Optimization »
Runtian Zhai · Chen Dan · Zico Kolter · Pradeep Ravikumar -
2021 Spotlight: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 Poster: Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning »
Zixin Wen · Yuanzhi Li -
2021 Spotlight: Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning »
Zixin Wen · Yuanzhi Li -
2021 Poster: On Proximal Policy Optimization's Heavy-tailed Gradients »
Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar -
2021 Spotlight: On Proximal Policy Optimization's Heavy-tailed Gradients »
Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar -
2020 Poster: Uniform Convergence of Rank-weighted Learning »
Justin Khim · Liu Leqi · Adarsh Prasad · Pradeep Ravikumar -
2020 Poster: Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification »
Chen Dan · Yuting Wei · Pradeep Ravikumar -
2020 Poster: Class-Weighted Classification: Trade-offs and Robust Approaches »
Ziyu Xu · Chen Dan · Justin Khim · Pradeep Ravikumar -
2020 Poster: Certified Robustness to Label-Flipping Attacks via Randomized Smoothing »
Elan Rosenfeld · Ezra Winston · Pradeep Ravikumar · Zico Kolter -
2018 Poster: Binary Classification with Karmic, Threshold-Quasi-Concave Metrics »
Bowei Yan · Sanmi Koyejo · Kai Zhong · Pradeep Ravikumar -
2018 Poster: Loss Decomposition for Fast Learning in Large Output Spaces »
En-Hsu Yen · Satyen Kale · Felix Xinnan Yu · Daniel Holtmann-Rice · Sanjiv Kumar · Pradeep Ravikumar -
2018 Oral: Binary Classification with Karmic, Threshold-Quasi-Concave Metrics »
Bowei Yan · Sanmi Koyejo · Kai Zhong · Pradeep Ravikumar -
2018 Oral: Loss Decomposition for Fast Learning in Large Output Spaces »
En-Hsu Yen · Satyen Kale · Felix Xinnan Yu · Daniel Holtmann-Rice · Sanjiv Kumar · Pradeep Ravikumar -
2018 Poster: Deep Density Destructors »
David Inouye · Pradeep Ravikumar -
2018 Oral: Deep Density Destructors »
David Inouye · Pradeep Ravikumar -
2017 Poster: Ordinal Graphical Models: A Tale of Two Approaches »
Arun Sai Suggala · Eunho Yang · Pradeep Ravikumar -
2017 Poster: Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization »
Qi Lei · En-Hsu Yen · Chao-Yuan Wu · Inderjit Dhillon · Pradeep Ravikumar -
2017 Poster: Latent Feature Lasso »
En-Hsu Yen · Wei-Cheng Lee · Sung-En Chang · Arun Suggala · Shou-De Lin · Pradeep Ravikumar -
2017 Talk: Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization »
Qi Lei · En-Hsu Yen · Chao-Yuan Wu · Inderjit Dhillon · Pradeep Ravikumar -
2017 Talk: Ordinal Graphical Models: A Tale of Two Approaches »
Arun Sai Suggala · Eunho Yang · Pradeep Ravikumar -
2017 Talk: Latent Feature Lasso »
En-Hsu Yen · Wei-Cheng Lee · Sung-En Chang · Arun Suggala · Shou-De Lin · Pradeep Ravikumar