Timezone: »
A common strategy in modern learning systems is to learn a representation that is useful for many tasks, a.k.a. representation learning. We study this strategy in the imitation learning setting for Markov decision processes (MDPs) where multiple experts' trajectories are available. We formulate representation learning as a bi-level optimization problem where the outer" optimization tries to learn the joint representation and the
inner" optimization encodes the imitation learning setup and tries to learn task-specific parameters. We instantiate this framework for the imitation learning settings of behavior cloning and observation-alone. Theoretically, we show using our framework that representation learning can provide sample complexity benefits for imitation learning in both settings. We also provide proof-of-concept experiments to verify our theory.
Author Information
Sanjeev Arora (Princeton University and Institute for Advanced Study)
Simon Du (Institute for Advanced Study)
Sham Kakade (University of Washington)
Sham Kakade is a Washington Research Foundation Data Science Chair, with a joint appointment in the Department of Computer Science and the Department of Statistics at the University of Washington, and is a co-director for the Algorithmic Foundations of Data Science Institute. He works on the mathematical foundations of machine learning and AI. Sham's thesis helped in laying the foundations of the PAC-MDP framework for reinforcement learning. With his collaborators, his additional contributions include: one of the first provably efficient policy search methods, Conservative Policy Iteration, for reinforcement learning; developing the mathematical foundations for the widely used linear bandit models and the Gaussian process bandit models; the tensor and spectral methodologies for provable estimation of latent variable models (applicable to mixture of Gaussians, HMMs, and LDA); the first sharp analysis of the perturbed gradient descent algorithm, along with the design and analysis of numerous other convex and non-convex algorithms. He is the recipient of the IBM Goldberg best paper award (in 2007) for contributions to fast nearest neighbor search and the best paper, INFORMS Revenue Management and Pricing Section Prize (2014). He has been program chair for COLT 2011. Sham was an undergraduate at Caltech, where he studied physics and worked under the guidance of John Preskill in quantum computing. He then completed his Ph.D. in computational neuroscience at the Gatsby Unit at University College London, under the supervision of Peter Dayan. He was a postdoc at the Dept. of Computer Science, University of Pennsylvania , where he broadened his studies to include computational game theory and economics from the guidance of Michael Kearns. Sham has been a Principal Research Scientist at Microsoft Research, New England, an associate professor at the Department of Statistics, Wharton, UPenn, and an assistant professor at the Toyota Technological Institute at Chicago.
Yuping Luo (Princeton University)
Nikunj Saunshi (Princeton University)
More from the Same Authors
-
2020 Poster: Soft Threshold Weight Reparameterization for Learnable Sparsity »
Aditya Kusupati · Vivek Ramanujan · Raghav Somani · Mitchell Wortsman · Prateek Jain · Sham Kakade · Ali Farhadi -
2020 Poster: Calibration, Entropy Rates, and Memory in Language Models »
Mark Braverman · Xinyi Chen · Sham Kakade · Karthik Narasimhan · Cyril Zhang · Yi Zhang -
2020 Poster: On the Expressivity of Neural Networks for Deep Reinforcement Learning »
Kefan Dong · Yuping Luo · Tianhe Yu · Chelsea Finn · Tengyu Ma -
2020 Poster: The Implicit and Explicit Regularization Effects of Dropout »
Colin Wei · Sham Kakade · Tengyu Ma -
2020 Poster: InstaHide: Instance-hiding Schemes for Private Distributed Learning »
Yangsibo Huang · Zhao Song · Kai Li · Sanjeev Arora -
2020 Poster: A Sample Complexity Separation between Non-Convex and Convex Meta-Learning »
Nikunj Umesh Saunshi · Yi Zhang · Mikhail Khodak · Sanjeev Arora -
2020 Poster: Meta-learning for Mixed Linear Regression »
Weihao Kong · Raghav Somani · Zhao Song · Sham Kakade · Sewoong Oh -
2020 Test Of Time: Test of Time: Gaussian Process Optimization in the Bandit Settings: No Regret and Experimental Design »
Niranjan Srinivas · Andreas Krause · Sham Kakade · Matthias W Seeger -
2019 Poster: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar -
2019 Poster: Online Control with Adversarial Disturbances »
Naman Agarwal · Brian Bullins · Elad Hazan · Sham Kakade · Karan Singh -
2019 Oral: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar -
2019 Oral: Online Control with Adversarial Disturbances »
Naman Agarwal · Brian Bullins · Elad Hazan · Sham Kakade · Karan Singh -
2019 Poster: Provably Efficient Maximum Entropy Exploration »
Elad Hazan · Sham Kakade · Karan Singh · Abby Van Soest -
2019 Oral: Provably Efficient Maximum Entropy Exploration »
Elad Hazan · Sham Kakade · Karan Singh · Abby Van Soest -
2019 Poster: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2019 Poster: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2019 Poster: Maximum Likelihood Estimation for Learning Populations of Parameters »
Ramya Korlakai Vinayak · Weihao Kong · Gregory Valiant · Sham Kakade -
2019 Oral: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2019 Oral: Maximum Likelihood Estimation for Learning Populations of Parameters »
Ramya Korlakai Vinayak · Weihao Kong · Gregory Valiant · Sham Kakade -
2019 Oral: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2018 Poster: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator »
Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi -
2018 Oral: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator »
Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi -
2018 Poster: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Oral: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Poster: On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization »
Sanjeev Arora · Nadav Cohen · Elad Hazan -
2018 Oral: On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization »
Sanjeev Arora · Nadav Cohen · Elad Hazan -
2018 Tutorial: Toward Theoretical Understanding of Deep Learning »
Sanjeev Arora -
2017 Workshop: Principled Approaches to Deep Learning »
Andrzej Pronobis · Robert Gens · Sham Kakade · Pedro Domingos -
2017 Poster: How to Escape Saddle Points Efficiently »
Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan -
2017 Talk: How to Escape Saddle Points Efficiently »
Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan -
2017 Poster: Generalization and Equilibrium in Generative Adversarial Nets (GANs) »
Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang -
2017 Talk: Generalization and Equilibrium in Generative Adversarial Nets (GANs) »
Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang