Timezone: »
A common strategy in modern learning systems is to learn a representation that is useful for many tasks, a.k.a. representation learning. We study this strategy in the imitation learning setting for Markov decision processes (MDPs) where multiple experts' trajectories are available. We formulate representation learning as a bi-level optimization problem where the outer" optimization tries to learn the joint representation and the
inner" optimization encodes the imitation learning setup and tries to learn task-specific parameters. We instantiate this framework for the imitation learning settings of behavior cloning and observation-alone. Theoretically, we show using our framework that representation learning can provide sample complexity benefits for imitation learning in both settings. We also provide proof-of-concept experiments to verify our theory.
Author Information
Sanjeev Arora (Princeton University and Institute for Advanced Study)
Simon Du (Institute for Advanced Study)
Sham Kakade (University of Washington)
Sham Kakade is a Gordon McKay Professor of Computer Science and Statistics at Harvard University and a co-director of the recently announced Kempner Institute. He works on the mathematical foundations of machine learning and AI. Sham's thesis helped in laying the statistical foundations of reinforcement learning. With his collaborators, his additional contributions include: one of the first provably efficient policy search methods, Conservative Policy Iteration, for reinforcement learning; developing the mathematical foundations for the widely used linear bandit models and the Gaussian process bandit models; the tensor and spectral methodologies for provable estimation of latent variable models; the first sharp analysis of the perturbed gradient descent algorithm, along with the design and analysis of numerous other convex and non-convex algorithms. He is the recipient of the ICML Test of Time Award (2020), the IBM Pat Goldberg best paper award (in 2007), INFORMS Revenue Management and Pricing Prize (2014). He has been program chair for COLT 2011. Sham was an undergraduate at Caltech, where he studied physics and worked under the guidance of John Preskill in quantum computing. He then completed his Ph.D. in computational neuroscience at the Gatsby Unit at University College London, under the supervision of Peter Dayan. He was a postdoc at the Dept. of Computer Science, University of Pennsylvania , where he broadened his studies to include computational game theory and economics from the guidance of Michael Kearns. Sham has been a Principal Research Scientist at Microsoft Research, New England, an associate professor at the Department of Statistics, Wharton, UPenn, and an assistant professor at the Toyota Technological Institute at Chicago.
Yuping Luo (Princeton University)
Nikunj Umesh Saunshi (Princeton University)
More from the Same Authors
-
2021 : A Short Note on the Relationship of Information Gain and Eluder Dimension »
Kaixuan Huang · Sham Kakade · Jason Lee · Qi Lei -
2021 : Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations »
Yuping Luo · Tengyu Ma -
2021 : Sparsity in the Partially Controllable LQR »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2023 : Fine-Tuning Language Models with Just Forward Passes »
Sadhika Malladi · Tianyu Gao · Eshaan Nichani · Jason Lee · Danqi Chen · Sanjeev Arora -
2023 : 🎤 Fine-Tuning Language Models with Just Forward Passes »
Sadhika Malladi · Tianyu Gao · Eshaan Nichani · Alex Damian · Jason Lee · Danqi Chen · Sanjeev Arora -
2023 : High-dimensional Optimization in the Age of ChatGPT, Sanjeev Arora »
Sanjeev Arora -
2023 Poster: Task-Specific Skill Localization in Fine-tuned Language Models »
Abhishek Panigrahi · Nikunj Saunshi · Haoyu Zhao · Sanjeev Arora -
2023 Poster: A Kernel-Based View of Language Model Fine-Tuning »
Sadhika Malladi · Alexander Wettig · Dingli Yu · Danqi Chen · Sanjeev Arora -
2022 : On the SDEs and Scaling Rules for Adaptive Gradient Algorithms »
Sadhika Malladi · Kaifeng Lyu · Abhishek Panigrahi · Sanjeev Arora -
2022 : Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence toMirror Descent »
Zhiyuan Li · Tianhao Wang · Jason Lee · Sanjeev Arora -
2022 Poster: Understanding Contrastive Learning Requires Incorporating Inductive Biases »
Nikunj Umesh Saunshi · Jordan Ash · Surbhi Goel · Dipendra Kumar Misra · Cyril Zhang · Sanjeev Arora · Sham Kakade · Akshay Krishnamurthy -
2022 Spotlight: Understanding Contrastive Learning Requires Incorporating Inductive Biases »
Nikunj Umesh Saunshi · Jordan Ash · Surbhi Goel · Dipendra Kumar Misra · Cyril Zhang · Sanjeev Arora · Sham Kakade · Akshay Krishnamurthy -
2022 Poster: Understanding Gradient Descent on the Edge of Stability in Deep Learning »
Sanjeev Arora · Zhiyuan Li · Abhishek Panigrahi -
2022 Spotlight: Understanding Gradient Descent on the Edge of Stability in Deep Learning »
Sanjeev Arora · Zhiyuan Li · Abhishek Panigrahi -
2021 : Sparsity in the Partially Controllable LQR »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2021 Poster: A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning »
Nikunj Umesh Saunshi · Arushi Gupta · Wei Hu -
2021 Spotlight: A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning »
Nikunj Umesh Saunshi · Arushi Gupta · Wei Hu -
2021 Poster: How Important is the Train-Validation Split in Meta-Learning? »
Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong -
2021 Spotlight: How Important is the Train-Validation Split in Meta-Learning? »
Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong -
2021 Poster: Bilinear Classes: A Structural Framework for Provable Generalization in RL »
Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang -
2021 Poster: Instabilities of Offline RL with Pre-Trained Neural Representation »
Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade -
2021 Spotlight: Instabilities of Offline RL with Pre-Trained Neural Representation »
Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade -
2021 Oral: Bilinear Classes: A Structural Framework for Provable Generalization in RL »
Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang -
2020 : QA for invited talk 8 Kakade »
Sham Kakade -
2020 : Invited talk 8 Kakade »
Sham Kakade -
2020 : Speaker Panel »
Csaba Szepesvari · Martha White · Sham Kakade · Gergely Neu · Shipra Agrawal · Akshay Krishnamurthy -
2020 : Exploration, Policy Gradient Methods, and the Deadly Triad - Sham Kakade »
Sham Kakade -
2020 Poster: Soft Threshold Weight Reparameterization for Learnable Sparsity »
Aditya Kusupati · Vivek Ramanujan · Raghav Somani · Mitchell Wortsman · Prateek Jain · Sham Kakade · Ali Farhadi -
2020 Poster: Calibration, Entropy Rates, and Memory in Language Models »
Mark Braverman · Xinyi Chen · Sham Kakade · Karthik Narasimhan · Cyril Zhang · Yi Zhang -
2020 Poster: On the Expressivity of Neural Networks for Deep Reinforcement Learning »
Kefan Dong · Yuping Luo · Tianhe (Kevin) Yu · Chelsea Finn · Tengyu Ma -
2020 Poster: The Implicit and Explicit Regularization Effects of Dropout »
Colin Wei · Sham Kakade · Tengyu Ma -
2020 Poster: InstaHide: Instance-hiding Schemes for Private Distributed Learning »
Yangsibo Huang · Zhao Song · Kai Li · Sanjeev Arora -
2020 Poster: A Sample Complexity Separation between Non-Convex and Convex Meta-Learning »
Nikunj Umesh Saunshi · Yi Zhang · Mikhail Khodak · Sanjeev Arora -
2020 Poster: Meta-learning for Mixed Linear Regression »
Weihao Kong · Raghav Somani · Zhao Song · Sham Kakade · Sewoong Oh -
2020 Test Of Time: Test of Time: Gaussian Process Optimization in the Bandit Settings: No Regret and Experimental Design »
Niranjan Srinivas · Andreas Krause · Sham Kakade · Matthias Seeger -
2019 : Is Optimization a sufficient language to understand Deep Learning? »
Sanjeev Arora -
2019 : Keynote by Sham Kakade: Prediction, Learning, and Memory »
Sham Kakade -
2019 Poster: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar -
2019 Poster: Online Control with Adversarial Disturbances »
Naman Agarwal · Brian Bullins · Elad Hazan · Sham Kakade · Karan Singh -
2019 Oral: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar -
2019 Oral: Online Control with Adversarial Disturbances »
Naman Agarwal · Brian Bullins · Elad Hazan · Sham Kakade · Karan Singh -
2019 Poster: Provably Efficient Maximum Entropy Exploration »
Elad Hazan · Sham Kakade · Karan Singh · Abby Van Soest -
2019 Oral: Provably Efficient Maximum Entropy Exploration »
Elad Hazan · Sham Kakade · Karan Singh · Abby Van Soest -
2019 Poster: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2019 Poster: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2019 Poster: Maximum Likelihood Estimation for Learning Populations of Parameters »
Ramya Korlakai Vinayak · Weihao Kong · Gregory Valiant · Sham Kakade -
2019 Oral: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2019 Oral: Maximum Likelihood Estimation for Learning Populations of Parameters »
Ramya Korlakai Vinayak · Weihao Kong · Gregory Valiant · Sham Kakade -
2019 Oral: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2018 Poster: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator »
Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi -
2018 Oral: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator »
Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi -
2018 Poster: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Oral: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Poster: On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization »
Sanjeev Arora · Nadav Cohen · Elad Hazan -
2018 Oral: On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization »
Sanjeev Arora · Nadav Cohen · Elad Hazan -
2018 Tutorial: Toward Theoretical Understanding of Deep Learning »
Sanjeev Arora -
2017 Workshop: Principled Approaches to Deep Learning »
Andrzej Pronobis · Robert Gens · Sham Kakade · Pedro Domingos -
2017 Poster: How to Escape Saddle Points Efficiently »
Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan -
2017 Talk: How to Escape Saddle Points Efficiently »
Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan -
2017 Poster: Generalization and Equilibrium in Generative Adversarial Nets (GANs) »
Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang -
2017 Talk: Generalization and Equilibrium in Generative Adversarial Nets (GANs) »
Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang