Timezone: »

Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning
Sebastian Curi · Ilija Bogunovic · Andreas Krause

Wed Jul 21 09:00 AM -- 11:00 AM (PDT) @ Virtual #None

In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training time. To ensure reliable performance, the RL agents need to exhibit robustness to such worst-case situations. The robust-RL framework addresses this challenge via a minimax optimization between an agent and an adversary. Previous robust RL algorithms are either sample inefficient, lack robustness guarantees, or do not scale to large problems. We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem while attaining near-optimal sample complexity guarantees. RH-UCRL is a model-based reinforcement learning (MBRL) algorithm that effectively distinguishes between epistemic and aleatoric uncertainty and efficiently explores both the agent and the adversary decision spaces during policy learning. We scale RH-UCRL to complex tasks via neural networks ensemble models as well as neural network policies. Experimentally we demonstrate that RH-UCRL outperforms other robust deep RL algorithms in a variety of adversarial environments.

Author Information

Sebastian Curi (ETH)
Ilija Bogunovic (ETH Zurich)
Andreas Krause (ETH Zurich)

Andreas Krause is a Professor of Computer Science at ETH Zurich, where he leads the Learning & Adaptive Systems Group. He also serves as Academic Co-Director of the Swiss Data Science Center. Before that he was an Assistant Professor of Computer Science at Caltech. He received his Ph.D. in Computer Science from Carnegie Mellon University (2008) and his Diplom in Computer Science and Mathematics from the Technical University of Munich, Germany (2004). He is a Microsoft Research Faculty Fellow and a Kavli Frontiers Fellow of the US National Academy of Sciences. He received ERC Starting Investigator and ERC Consolidator grants, the Deutscher Mustererkennungspreis, an NSF CAREER award, the Okawa Foundation Research Grant recognizing top young researchers in telecommunications as well as the ETH Golden Owl teaching award. His research on machine learning and adaptive systems has received awards at several premier conferences and journals, including the ACM SIGKDD Test of Time award 2019 and the ICML Test of Time award 2020. Andreas Krause served as Program Co-Chair for ICML 2018, and is regularly serving as Area Chair or Senior Program Committee member for ICML, NeurIPS, AAAI and IJCAI, and as Action Editor for the Journal of Machine Learning Research.

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors