When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. As part of off-policy learning, we propose a simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.
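As a sketch of the problem setup described above, batch policy learning under constraints can be written as a constrained optimization over policies and relaxed through a Lagrangian. The symbols below (C for the main objective cost, G for the vector of constraint costs, τ for the constraint thresholds, λ for the multipliers) are notational assumptions for illustration and are not taken from this page.

```latex
\begin{align*}
  &\min_{\pi} \; C(\pi) \quad \text{subject to} \quad G(\pi) \le \tau, \\
  &L(\pi, \lambda) = C(\pi) + \lambda^{\top}\bigl(G(\pi) - \tau\bigr), \qquad \lambda \ge 0 .
\end{align*}
```

Under this reading, a batch reinforcement learning subroutine would approximately minimize L over π for a fixed λ, an online learning subroutine would update λ, and OPE would estimate C(π) and G(π) from the pre-collected data.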
Author Information
Hoang Le (Caltech)
Hoang M. Le is a PhD Candidate in the Computing and Mathematical Sciences Department at the California Institute of Technology. He received an M.S. in Cognitive Systems and Interactive Media from the Universitat Pompeu Fabra, Barcelona, Spain, and a B.A. in Mathematics from Bucknell University in Lewisburg, PA. He is a recipient of an Amazon AI Fellowship. Hoang’s research focuses on the theory and applications of sequential decision making, with a strong focus on imitation learning. He has broad familiarity with the latest advances in imitation learning techniques and applications. His own research in imitation learning blends principled new techniques with a diverse range of application domains. In addition to popular reinforcement learning domains such as maze navigation and Atari games, his prior work on imitation learning has been applied to learning human behavior in team sports and developing automatic camera broadcasting systems.
Cameron Voloshin (Caltech)
Yisong Yue (Caltech)
Yisong Yue is an assistant professor in the Computing and Mathematical Sciences Department at the California Institute of Technology. He was previously a research scientist at Disney Research. Before that, he was a postdoctoral researcher in the Machine Learning Department and the iLab at Carnegie Mellon University. He received a Ph.D. from Cornell University and a B.S. from the University of Illinois at Urbana-Champaign. Yisong's research interests lie primarily in the theory and application of statistical machine learning. He is particularly interested in developing novel methods for interactive machine learning and structured prediction. In the past, his research has been applied to information retrieval, recommender systems, text classification, learning from rich user interfaces, analyzing implicit human feedback, data-driven animation, behavior analysis, sports analytics, policy learning in robotics, and adaptive planning & allocation problems.
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Batch Policy Learning under Constraints
  Thu. Jun 13th 04:00 -- 04:20 PM Room Hall B
More from the Same Authors
- 2022 Workshop: Adaptive Experimental Design and Active Learning in the Real World
  Mojmir Mutny · Willie Neiswanger · Ilija Bogunovic · Stefano Ermon · Yisong Yue · Andreas Krause
- 2022 Poster: Investigating Generalization by Controlling Normalized Margin
  Alexander Farhang · Jeremy Bernstein · Kushal Tirumala · Yang Liu · Yisong Yue
- 2022 Spotlight: Investigating Generalization by Controlling Normalized Margin
  Alexander Farhang · Jeremy Bernstein · Kushal Tirumala · Yang Liu · Yisong Yue
- 2022 Poster: LyaNet: A Lyapunov Framework for Training Neural ODEs
  Ivan Dario Jimenez Rodriguez · Aaron Ames · Yisong Yue
- 2022 Spotlight: LyaNet: A Lyapunov Framework for Training Neural ODEs
  Ivan Dario Jimenez Rodriguez · Aaron Ames · Yisong Yue
- 2021 : Personalized Preference Learning - from Spinal Cord Stimulation to Exoskeletons
  Yisong Yue
- 2021 Poster: Learning by Turning: Neural Architecture Aware Optimisation
  Yang Liu · Jeremy Bernstein · Markus Meister · Yisong Yue
- 2021 Spotlight: Learning by Turning: Neural Architecture Aware Optimisation
  Yang Liu · Jeremy Bernstein · Markus Meister · Yisong Yue
- 2020 Workshop: Real World Experiment Design and Active Learning
  Ilija Bogunovic · Willie Neiswanger · Yisong Yue
- 2020 Poster: Learning Calibratable Policies using Programmatic Style-Consistency
  Eric Zhan · Albert Tseng · Yisong Yue · Adith Swaminathan · Matthew Hausknecht
- 2020 Poster: Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
  Jung Yeon Park · Kenneth Carr · Stephan Zheng · Yisong Yue · Rose Yu
- 2019 Workshop: Real-world Sequential Decision Making: Reinforcement Learning and Beyond
  Hoang Le · Yisong Yue · Adith Swaminathan · Byron Boots · Ching-An Cheng
- 2019 Poster: Control Regularization for Reduced Variance Reinforcement Learning
  Richard Cheng · Abhinav Verma · Gabor Orosz · Swarat Chaudhuri · Yisong Yue · Joel Burdick
- 2019 Oral: Control Regularization for Reduced Variance Reinforcement Learning
  Richard Cheng · Abhinav Verma · Gabor Orosz · Swarat Chaudhuri · Yisong Yue · Joel Burdick
- 2018 Poster: Iterative Amortized Inference
  Joe Marino · Yisong Yue · Stephan Mandt
- 2018 Poster: Hierarchical Imitation and Reinforcement Learning
  Hoang Le · Nan Jiang · Alekh Agarwal · Miroslav Dudik · Yisong Yue · Hal Daumé III
- 2018 Oral: Iterative Amortized Inference
  Joe Marino · Yisong Yue · Stephan Mandt
- 2018 Oral: Hierarchical Imitation and Reinforcement Learning
  Hoang Le · Nan Jiang · Alekh Agarwal · Miroslav Dudik · Yisong Yue · Hal Daumé III
- 2018 Poster: Stagewise Safe Bayesian Optimization with Gaussian Processes
  Yanan Sui · Vincent Zhuang · Joel Burdick · Yisong Yue
- 2018 Oral: Stagewise Safe Bayesian Optimization with Gaussian Processes
  Yanan Sui · Vincent Zhuang · Joel Burdick · Yisong Yue
- 2018 Tutorial: Imitation Learning
  Yisong Yue · Hoang Le
- 2017 Poster: Coordinated Multi-Agent Imitation Learning
  Hoang Le · Yisong Yue · Peter Carr · Patrick Lucey
- 2017 Talk: Coordinated Multi-Agent Imitation Learning
  Hoang Le · Yisong Yue · Peter Carr · Patrick Lucey