Poster
Batch Policy Learning under Constraints
Hoang Le · Cameron Voloshin · Yisong Yue

Thu Jun 13 06:30 PM -- 09:00 PM (PDT) @ Pacific Ballroom #31

When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. As part of off-policy learning, we propose a simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.
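The meta-algorithm described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Lagrangian-style formulation in which a policy learner repeatedly computes a best response to the current constraint weights and an online learner then updates those weights. To stay self-contained, the batch reinforcement learning and OPE subroutines are replaced by a hypothetical finite set of candidate policies whose main and constraint costs are already estimated, and projected gradient ascent is an assumed choice of online learner. All names (`constrained_policy_search`, `main_cost`, `constraint_cost`, `threshold`) are illustrative.

```python
import numpy as np

def constrained_policy_search(main_cost, constraint_cost, threshold,
                              rounds=2000, step=0.05, lam_max=10.0):
    """Toy sketch: alternate a best-response step and an online update.

    main_cost:       shape (n_policies,), estimated main objective per policy
    constraint_cost: shape (n_policies, n_constraints), estimated constraint costs
    threshold:       shape (n_constraints,), allowed level for each constraint
    Returns the mixture over candidate policies and the final multipliers.
    """
    main_cost = np.asarray(main_cost, dtype=float)
    constraint_cost = np.asarray(constraint_cost, dtype=float)
    threshold = np.asarray(threshold, dtype=float)
    n_policies, n_constraints = constraint_cost.shape

    lam = np.zeros(n_constraints)   # constraint weights (assumed Lagrange multipliers)
    counts = np.zeros(n_policies)   # how often each policy was the best response

    for _ in range(rounds):
        # Best response: stand-in for a batch RL subroutine on the weighted cost.
        lagrangian = main_cost + (constraint_cost - threshold) @ lam
        best = int(np.argmin(lagrangian))
        counts[best] += 1
        # Online learner: projected gradient ascent on the violated constraints.
        violation = constraint_cost[best] - threshold
        lam = np.clip(lam + step * violation, 0.0, lam_max)

    # The returned policy is the uniform mixture over the per-round best responses.
    return counts / rounds, lam
```

With three hypothetical policies, `constrained_policy_search([1.0, 1.5, 3.0], [[1.0], [0.5], [0.0]], [0.4])` places most of the mixture weight on the second policy and the rest mainly on the third, so the averaged constraint cost settles near the threshold. Returning a mixture rather than the last iterate is a design choice of this sketch: no single candidate here is both feasible and cheap, but a randomized combination is.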

Author Information

Hoang Le (Caltech)

Hoang M. Le is a PhD Candidate in the Computing and Mathematical Sciences Department at the California Institute of Technology. He received an M.S. in Cognitive Systems and Interactive Media from the Universitat Pompeu Fabra, Barcelona, Spain, and a B.A. in Mathematics from Bucknell University in Lewisburg, PA. He is a recipient of an Amazon AI Fellowship. Hoang’s research focuses on the theory and applications of sequential decision making, with a strong focus on imitation learning. He has broad familiarity with the latest advances in imitation learning techniques and applications. His own research in imitation learning blends principled new techniques with a diverse range of application domains. In addition to popular reinforcement learning domains such as maze navigation and Atari games, his prior work on imitation learning has been applied to learning human behavior in team sports and developing automatic camera broadcasting systems.

Cameron Voloshin (Caltech)
Yisong Yue (Caltech)

Yisong Yue is a Professor of Computing and Mathematical Sciences at Caltech and (via sabbatical) a Principal Scientist at Latitude AI. His research interests span both fundamental and applied pursuits, from novel learning-theoretic frameworks all the way to deep learning deployed in autonomous driving on public roads. His work has been recognized with multiple paper awards and nominations, including in robotics, computer vision, sports analytics, machine learning for health, and information retrieval. At Latitude AI, he is working on machine learning approaches to motion planning for autonomous driving.
