In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.
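As a rough illustration of the idea in the abstract, heteroscedastic regression is commonly set up so that a model predicts both a mean and an input-dependent variance and is trained with a Gaussian negative log-likelihood. The sketch below is not the authors' implementation. The function names `gaussian_nll` and `should_plan` and the threshold `max_var` are hypothetical, introduced only for illustration, and the simple variance threshold is an assumed gating rule.

```python
import math


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under a Gaussian N(mean, exp(log_var)).

    Predicting log-variance keeps the variance positive without constraints.
    """
    var = math.exp(log_var)
    return 0.5 * (log_var + (y - mean) ** 2 / var + math.log(2 * math.pi))


def should_plan(log_var, max_var=0.1):
    """Hypothetical gate: use the model only where predicted variance is small."""
    return math.exp(log_var) <= max_var


# Where the mean prediction is badly off, a larger predicted variance
# gives a lower loss, so the variance output can learn to flag such regions.
loss_confident = gaussian_nll(y=3.0, mean=0.0, log_var=0.0)
loss_uncertain = gaussian_nll(y=3.0, mean=0.0, log_var=math.log(9.0))
```

Under this loss, the variance output grows in regions where the model's mean prediction cannot fit the data, which is one way such a model can signal inadequacy. A planner could then consult a gate like `should_plan` before using a model prediction in a given state.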
Author Information
Zaheer Abbas (University of Alberta)
Samuel Sokota (University of Alberta)
Erin Talvitie (Harvey Mudd College)
Erin Talvitie is an associate professor of Computer Science at Harvey Mudd College. She graduated from Oberlin College in 2004 with majors in Computer Science and Mathematics and received her Ph.D. in Artificial Intelligence from the University of Michigan in 2010. She was a founding member of the Department of Computer Science at Franklin & Marshall College before moving on to Harvey Mudd College in 2019. Her research interests focus on model-based reinforcement learning -- specifically scaling model-based approaches up to complex, high-dimensional problems -- with the aim of working toward artificial autonomous agents that can learn to act flexibly and competently in unknown environments. She is the recipient of an NSF Graduate Research Fellowship, an NSF CAREER grant, outstanding reviewer awards from AAAI and NeurIPS, a best paper nomination from AAMAS, and a best paper award from RLDM.
Martha White (University of Alberta)
More from the Same Authors
- 2023 Poster: Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
  Brett Daley · Martha White · Christopher Amato · Marlos C. Machado
- 2022: A Model-Based Reinforcement Learning Wishlist
  Erin Talvitie
- 2022 Poster: A Temporal-Difference Approach to Policy Gradient Estimation
  Samuele Tosatto · Andrew Patterson · Martha White · A. Mahmood
- 2022 Spotlight: A Temporal-Difference Approach to Policy Gradient Estimation
  Samuele Tosatto · Andrew Patterson · Martha White · A. Mahmood
- 2020: Panel Discussion
  Eric Eaton · Martha White · Doina Precup · Irina Rish · Harm van Seijen
- 2020: QA for invited talk 5 White
  Martha White
- 2020: Invited talk 5 White
  Martha White
- 2020: An Off-policy Policy Gradient Theorem: A Tale About Weightings - Martha White
  Martha White
- 2020: Speaker Panel
  Csaba Szepesvari · Martha White · Sham Kakade · Gergely Neu · Shipra Agrawal · Akshay Krishnamurthy
- 2020 Poster: Gradient Temporal-Difference Learning with Regularized Corrections
  Sina Ghiassian · Andrew Patterson · Shivam Garg · Dhawal Gupta · Adam White · Martha White
- 2020 Poster: Optimizing for the Future in Non-Stationary MDPs
  Yash Chandak · Georgios Theocharous · Shiv Shankar · Martha White · Sridhar Mahadevan · Philip Thomas
- 2019 Workshop: Exploration in Reinforcement Learning Workshop
  Benjamin Eysenbach · Surya Bhupatiraju · Shixiang Gu · Harrison Edwards · Martha White · Pierre-Yves Oudeyer · Kenneth Stanley · Emma Brunskill
- 2018 Poster: Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control
  Yangchen Pan · Amir-massoud Farahmand · Martha White · Saleh Nabi · Piyush Grover · Daniel Nikovski
- 2018 Oral: Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control
  Yangchen Pan · Amir-massoud Farahmand · Martha White · Saleh Nabi · Piyush Grover · Daniel Nikovski
- 2018 Poster: Improving Regression Performance with Distributional Losses
  Ehsan Imani · Martha White
- 2018 Oral: Improving Regression Performance with Distributional Losses
  Ehsan Imani · Martha White