We provide a unifying view of a large family of previous imitation learning algorithms through the lens of moment matching. At its core, our classification scheme is based on whether the learner attempts to match (1) reward or (2) action-value moments of the expert's behavior, with each option leading to differing algorithmic approaches. By considering adversarially chosen divergences between learner and expert behavior, we are able to derive bounds on policy performance that apply for all algorithms in each of these classes, the first to our knowledge. We also introduce the notion of moment recoverability, implicit in many previous analyses of imitation learning, which allows us to cleanly delineate how well each algorithmic family is able to mitigate compounding errors. We derive three novel algorithm templates (AdVIL, AdRIL, and DAeQuIL) with strong guarantees, simple implementation, and competitive empirical performance.
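As a rough illustration of the reward-moment option described above, the sketch below scores a learner against an expert by the largest gap in expected moment value over a small, finite class of candidate moment functions. The finite class (closed under negation), the names `expected_moment` and `moment_gap`, and the toy state-action samples are all assumptions made for illustration. None of them come from the paper, and this is not an implementation of AdVIL, AdRIL, or DAeQuIL.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types: a (state, action) sample and a moment function f(s, a).
StateAction = Tuple[float, int]
Moment = Callable[[float, int], float]


def expected_moment(f: Moment, samples: Sequence[StateAction]) -> float:
    """Empirical mean of f over sampled state-action pairs."""
    return sum(f(s, a) for s, a in samples) / len(samples)


def moment_gap(
    moments: Sequence[Moment],
    learner: Sequence[StateAction],
    expert: Sequence[StateAction],
) -> float:
    """Largest expert-minus-learner gap in expected moment over the class.

    The max plays the role of an adversary picking the moment that best
    separates the two behaviors (assumed finite class, for illustration).
    """
    return max(
        expected_moment(f, expert) - expected_moment(f, learner)
        for f in moments
    )


# Toy moment class, closed under negation so the gap is never negative.
BASE = [
    lambda s, a: float(a),      # how often action 1 is taken
    lambda s, a: s * a / 3.0,   # action 1 weighted by (scaled) state
]
MOMENTS = BASE + [lambda s, a, f=f: -f(s, a) for f in BASE]

# Toy samples: the learner disagrees with the expert at states 0.0 and 3.0.
expert = [(0.0, 1), (1.0, 1), (2.0, 0), (3.0, 1)]
learner = [(0.0, 0), (1.0, 1), (2.0, 0), (3.0, 0)]

gap = moment_gap(MOMENTS, learner, expert)
print(f"worst-case moment gap: {gap:.3f}")
```

A learner whose samples match the expert's exactly gets a gap of zero under this measure. A richer moment class makes the adversary stronger and the match harder to achieve.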
Author Information
Gokul Swamy (Carnegie Mellon University)
Sanjiban Choudhury (Aurora)
J. Bagnell (Aurora Innovation)
Steven Wu (Carnegie Mellon University)
More from the Same Authors
-
2021 : Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
· Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju -
2021 : Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2021 : Stateful Strategic Regression »
Keegan Harris · Hoda Heidari · Steven Wu -
2021 : Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods »
Terrance Liu · Giuseppe Vietri · Steven Wu -
2021 : Private Multi-Task Learning: Formulation and Applications to Federated Learning »
Shengyuan Hu · Steven Wu · Virginia Smith -
2021 : Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Steven Wu · Mingyi Hong -
2021 : Improved Privacy Filters and Odometers: Time-Uniform Bounds in Privacy Composition »
Justin Whitehouse · Aaditya Ramdas · Ryan Rogers · Steven Wu -
2021 : Scalable Algorithms for Nonlinear Causal Inference »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2021 : Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 : Meta-Learning Adversarial Bandits »
Nina Balcan · Keegan Harris · Mikhail Khodak · Steven Wu -
2023 Poster: Generating Private Synthetic Data with Genetic Algorithms »
Terrance Liu · Jingwu Tang · Giuseppe Vietri · Steven Wu -
2023 Poster: A nonparametric extension of randomized response for private confidence sets »
Ian Waudby-Smith · Steven Wu · Aaditya Ramdas -
2023 Poster: Inverse Reinforcement Learning without Reinforcement Learning »
Gokul Swamy · David Wu · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2023 Poster: Fully-Adaptive Composition in Differential Privacy »
Justin Whitehouse · Aaditya Ramdas · Ryan Rogers · Steven Wu -
2023 Poster: The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms »
Anirudh Vemula · Yuda Song · Aarti Singh · J. Bagnell · Sanjiban Choudhury -
2023 Oral: A nonparametric extension of randomized response for private confidence sets »
Ian Waudby-Smith · Steven Wu · Aaditya Ramdas -
2022 Poster: Information Discrepancy in Strategic Learning »
Yahav Bechavod · Chara Podimata · Steven Wu · Juba Ziani -
2022 Poster: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Poster: Causal Imitation Learning under Temporally Correlated Noise »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 Spotlight: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Spotlight: Information Discrepancy in Strategic Learning »
Yahav Bechavod · Chara Podimata · Steven Wu · Juba Ziani -
2022 Oral: Causal Imitation Learning under Temporally Correlated Noise »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 Poster: Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2022 Poster: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning »
Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu -
2022 Poster: Improved Regret for Differentially Private Exploration in Linear MDP »
Dung Ngo · Giuseppe Vietri · Steven Wu -
2022 Poster: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi -
2022 Spotlight: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi -
2022 Spotlight: Improved Regret for Differentially Private Exploration in Linear MDP »
Dung Ngo · Giuseppe Vietri · Steven Wu -
2022 Spotlight: Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2022 Spotlight: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning »
Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu -
2021 : Poster »
Shiji Zhou · Nastaran Okati · Wichinpong Sinchaisri · Kim de Bie · Ana Lucic · Mina Khan · Ishaan Shah · Jinghui Lu · Andreas Kirsch · Julius Frost · Ze Gong · Gokul Swamy · Ah Young Kim · Ahmed Baruwa · Ranganath Krishnan -
2021 Poster: Leveraging Public Data for Practical Private Query Release »
Terrance Liu · Giuseppe Vietri · Thomas Steinke · Jonathan Ullman · Steven Wu -
2021 Spotlight: Leveraging Public Data for Practical Private Query Release »
Terrance Liu · Giuseppe Vietri · Thomas Steinke · Jonathan Ullman · Steven Wu -
2021 Poster: Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2021 Spotlight: Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2021 Poster: Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju -
2021 Poster: Incentivizing Compliance with Algorithmic Instruments »
Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu -
2021 Spotlight: Incentivizing Compliance with Algorithmic Instruments »
Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu -
2021 Spotlight: Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju