When learning policies for real-world domains, two important questions arise: (i) how to efficiently use existing off-line, off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We study the problem of batch policy learning under multiple constraints and offer a systematic solution. We first propose a flexible meta algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to lane keeping and smooth driving constraints. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.
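The meta-algorithm's structure — a policy player best-responding to a Lagrangian while an online learner adjusts the constraint multipliers — can be illustrated with a toy sketch. Everything here is hypothetical: the candidate policies, their cost and constraint values, and the learning rate are made-up numbers, and the argmin oracle stands in for the batch RL subroutine (e.g., a Fitted Q-style solver run on logged data) that the paper instantiates.

```python
import numpy as np

# Toy sketch of a Lagrangian-style meta-loop (hypothetical values, not the
# paper's exact instantiation). Each candidate policy pi has a main cost
# c(pi) and one constraint cost g(pi); we require g(pi) <= tau on average.
costs = np.array([1.0, 0.6, 0.2])       # main objective (lower is better)
constraint = np.array([0.1, 0.5, 0.9])  # constraint cost per policy
tau = 0.4                               # constraint threshold

lam, lr, T = 0.0, 0.5, 200
counts = np.zeros(3)                    # empirical mixture over policies
for _ in range(T):
    # Policy player: best response to the current multiplier. A batch RL
    # procedure trained on off-policy data would play this role in practice.
    lagrangian = costs + lam * (constraint - tau)
    best = int(np.argmin(lagrangian))
    counts[best] += 1
    # Constraint player: online gradient ascent on the Lagrange multiplier,
    # projected back onto lambda >= 0.
    lam = max(0.0, lam + lr * (constraint[best] - tau))

mixture = counts / T                    # returned policy is a mixture
avg_constraint = float(mixture @ constraint)
print(mixture, avg_constraint)
```

The returned object is a mixed policy: no single candidate here is both cheap and feasible, but the averaged play of the best-response oracle satisfies the constraint approximately, which is why guarantees of this kind are naturally stated for mixtures.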
Author Information
Hoang Le (Caltech)
Hoang M. Le is a PhD candidate in the Computing and Mathematical Sciences Department at the California Institute of Technology. He received an M.S. in Cognitive Systems and Interactive Media from Universitat Pompeu Fabra, Barcelona, Spain, and a B.A. in Mathematics from Bucknell University in Lewisburg, PA. He is a recipient of an Amazon AI Fellowship. Hoang’s research focuses on the theory and applications of sequential decision making, with particular emphasis on imitation learning, where his work blends principled new techniques with a diverse range of application domains. In addition to popular reinforcement learning domains such as maze navigation and Atari games, his prior work on imitation learning has been applied to learning human behavior in team sports and to developing automatic camera broadcasting systems.
Cameron Voloshin (Caltech)
Yisong Yue (Caltech)

Yisong Yue is a Professor of Computing and Mathematical Sciences at Caltech and (via sabbatical) a Principal Scientist at Latitude AI. His research interests span both fundamental and applied pursuits, from novel learning-theoretic frameworks all the way to deep learning deployed in autonomous driving on public roads. His work has been recognized with multiple paper awards and nominations, including in robotics, computer vision, sports analytics, machine learning for health, and information retrieval. At Latitude AI, he is working on machine learning approaches to motion planning for autonomous driving.
Related Events (a corresponding poster, oral, or spotlight)
2019 Poster: Batch Policy Learning under Constraints »
Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #31
More from the Same Authors
2023 : Preferential Multi-Attribute Bayesian Optimization with Application to Exoskeleton Personalization »
Raul Astudillo · Amy Li · Maegan Tucker · Chu Xin Cheng · Aaron Ames · Yisong Yue
2023 : Dueling Bandits for Online Preference Learning »
Yisong Yue
2023 Poster: Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation »
Fengxue Zhang · Jialin Song · James Bowden · Alexander Ladd · Yisong Yue · Thomas Desautels · Yuxin Chen
2023 Poster: MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior »
Jennifer J. Sun · Markus Marks · Andrew Ulmer · Dipam Chakraborty · Brian Geuther · Edward Hayes · Heng Jia · Vivek Kumar · Sebastian Oleszko · Zachary Partridge · Milan Peelman · Alice Robie · Catherine Schretter · Keith Sheppard · Chao Sun · Param Uttarwar · Julian Wagner · Erik Werner · Joseph Parker · Pietro Perona · Yisong Yue · Kristin Branson · Ann Kennedy
2023 Poster: Eventual Discounting Temporal Logic Counterfactual Experience Replay »
Cameron Voloshin · Abhinav Verma · Yisong Yue
2022 Workshop: Adaptive Experimental Design and Active Learning in the Real World »
Mojmir Mutny · Willie Neiswanger · Ilija Bogunovic · Stefano Ermon · Yisong Yue · Andreas Krause
2022 Poster: Investigating Generalization by Controlling Normalized Margin »
Alexander Farhang · Jeremy Bernstein · Kushal Tirumala · Yang Liu · Yisong Yue
2022 Spotlight: Investigating Generalization by Controlling Normalized Margin »
Alexander Farhang · Jeremy Bernstein · Kushal Tirumala · Yang Liu · Yisong Yue
2022 Poster: LyaNet: A Lyapunov Framework for Training Neural ODEs »
Ivan Dario Jimenez Rodriguez · Aaron Ames · Yisong Yue
2022 Spotlight: LyaNet: A Lyapunov Framework for Training Neural ODEs »
Ivan Dario Jimenez Rodriguez · Aaron Ames · Yisong Yue
2021 : Personalized Preference Learning - from Spinal Cord Stimulation to Exoskeletons »
Yisong Yue
2021 Poster: Learning by Turning: Neural Architecture Aware Optimisation »
Yang Liu · Jeremy Bernstein · Markus Meister · Yisong Yue
2021 Spotlight: Learning by Turning: Neural Architecture Aware Optimisation »
Yang Liu · Jeremy Bernstein · Markus Meister · Yisong Yue
2020 Workshop: Real World Experiment Design and Active Learning »
Ilija Bogunovic · Willie Neiswanger · Yisong Yue
2020 Poster: Learning Calibratable Policies using Programmatic Style-Consistency »
Eric Zhan · Albert Tseng · Yisong Yue · Adith Swaminathan · Matthew Hausknecht
2020 Poster: Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis »
Jung Yeon Park · Kenneth Carr · Stephan Zheng · Yisong Yue · Rose Yu
2019 Workshop: Real-world Sequential Decision Making: Reinforcement Learning and Beyond »
Hoang Le · Yisong Yue · Adith Swaminathan · Byron Boots · Ching-An Cheng
2019 Poster: Control Regularization for Reduced Variance Reinforcement Learning »
Richard Cheng · Abhinav Verma · Gabor Orosz · Swarat Chaudhuri · Yisong Yue · Joel Burdick
2019 Oral: Control Regularization for Reduced Variance Reinforcement Learning »
Richard Cheng · Abhinav Verma · Gabor Orosz · Swarat Chaudhuri · Yisong Yue · Joel Burdick
2018 Poster: Iterative Amortized Inference »
Joe Marino · Yisong Yue · Stephan Mandt
2018 Poster: Hierarchical Imitation and Reinforcement Learning »
Hoang Le · Nan Jiang · Alekh Agarwal · Miroslav Dudik · Yisong Yue · Hal Daumé III
2018 Oral: Iterative Amortized Inference »
Joe Marino · Yisong Yue · Stephan Mandt
2018 Oral: Hierarchical Imitation and Reinforcement Learning »
Hoang Le · Nan Jiang · Alekh Agarwal · Miroslav Dudik · Yisong Yue · Hal Daumé III
2018 Poster: Stagewise Safe Bayesian Optimization with Gaussian Processes »
Yanan Sui · Vincent Zhuang · Joel Burdick · Yisong Yue
2018 Oral: Stagewise Safe Bayesian Optimization with Gaussian Processes »
Yanan Sui · Vincent Zhuang · Joel Burdick · Yisong Yue
2018 Tutorial: Imitation Learning »
Yisong Yue · Hoang Le
2017 Poster: Coordinated Multi-Agent Imitation Learning »
Hoang Le · Yisong Yue · Peter Carr · Patrick Lucey
2017 Talk: Coordinated Multi-Agent Imitation Learning »
Hoang Le · Yisong Yue · Peter Carr · Patrick Lucey