Estimates of predictive uncertainty are important for accurate model-based planning and reinforcement learning. However, predictive uncertainties, especially ones derived from modern deep learning systems, can be inaccurate and impose a bottleneck on performance. This paper explores which uncertainties are needed for model-based reinforcement learning and argues that ideal uncertainties should be calibrated, i.e., their probabilities should match empirical frequencies of predicted events. We describe a simple way to augment any model-based reinforcement learning agent with a calibrated model and show that doing so consistently improves planning, sample complexity, and exploration. On the HalfCheetah MuJoCo task, our system achieves state-of-the-art performance using 50% fewer samples than the current leading approach. Our findings suggest that calibration can improve the performance of model-based reinforcement learning with minimal computational and implementation overhead.
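To make the notion of calibration in the abstract concrete, the following is a minimal diagnostic sketch, not the method from the paper. It assumes Gaussian predictive distributions and synthetic data, and the helper names `empirical_coverage` and `calibration_error` are hypothetical. For each probability level p, it measures how often the observed outcome falls at or below the forecast's p-quantile; for a calibrated forecaster that frequency should be close to p.

```python
import random
from statistics import NormalDist


def empirical_coverage(means, stds, outcomes, levels):
    """For each level p, return the fraction of outcomes whose
    predicted CDF value is <= p (i.e. at or below the p-quantile)."""
    cdf_values = [
        NormalDist(m, s).cdf(y) for m, s, y in zip(means, stds, outcomes)
    ]
    n = len(cdf_values)
    return [sum(c <= p for c in cdf_values) / n for p in levels]


def calibration_error(levels, coverage):
    """Mean squared gap between nominal levels and empirical frequencies."""
    return sum((p - c) ** 2 for p, c in zip(levels, coverage)) / len(levels)


# Synthetic data (an assumption for illustration): outcomes are drawn
# from a Gaussian with standard deviation 1.0 around each predicted mean.
rng = random.Random(0)
n = 5000
means = [rng.uniform(-1.0, 1.0) for _ in range(n)]
outcomes = [rng.gauss(m, 1.0) for m in means]
levels = [0.1, 0.25, 0.5, 0.75, 0.9]

# A forecaster reporting the true spread versus one that is overconfident
# (reports half the true standard deviation).
calibrated = empirical_coverage(means, [1.0] * n, outcomes, levels)
overconfident = empirical_coverage(means, [0.5] * n, outcomes, levels)

err_calibrated = calibration_error(levels, calibrated)
err_overconfident = calibration_error(levels, overconfident)

print("levels:       ", levels)
print("calibrated:   ", [round(c, 3) for c in calibrated])
print("overconfident:", [round(c, 3) for c in overconfident])
```

In this sketch the overconfident forecaster places too many outcomes in its tails, so its empirical frequencies drift away from the nominal levels, while the forecaster that reports the true spread tracks them closely.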
Author Information
Ali Malik (Stanford University)
Volodymyr Kuleshov (Stanford University / Afresh)
Jiaming Song (Stanford)
Danny Nemer (Afresh Technologies)
Harlan Seymour (Afresh Technologies)
Stefano Ermon (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Oral: Calibrated Model-Based Deep Reinforcement Learning »
Thu. Jun 13th 04:40 -- 05:00 PM Room Hall B
More from the Same Authors
-
2022 : Transform Once: Efficient Operator Learning in Frequency Domain »
Michael Poli · Stefano Massaroli · Federico Berto · Jinkyoo Park · Tri Dao · Christopher Re · Stefano Ermon -
2023 : Calibrated Propensities for Causal Effect Estimation »
Shachi Deshpande · Volodymyr Kuleshov -
2023 : The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-language Models »
Chenwei Wu · Li Li · Stefano Ermon · Patrick Haffner · Rong Ge · Zaiwei Zhang -
2023 : Parallel Sampling of Diffusion Models »
Andy Shih · Suneel Belkhale · Stefano Ermon · Dorsa Sadigh · Nima Anari -
2023 : On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization »
Chieh-Hsin Lai · Yuhta Takida · Toshimitsu Uesaka · Naoki Murata · Yuki Mitsufuji · Stefano Ermon -
2023 : Regularized Data Programming with Automated Bayesian Prior Selection »
Jacqueline Maasch · Hao Zhang · Qian Yang · Fei Wang · Volodymyr Kuleshov -
2023 : Direct Preference Optimization: Your Language Model is Secretly a Reward Model »
Rafael Rafailov · Archit Sharma · Eric Mitchell · Stefano Ermon · Christopher Manning · Chelsea Finn -
2023 : Invited Talk by Stefano Ermon »
Stefano Ermon -
2023 Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators »
Felix Petersen · Marco Cuturi · Mathias Niepert · Hilde Kuehne · Michael Kagan · Willie Neiswanger · Stefano Ermon -
2023 Oral: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: Geometric Latent Diffusion Models for 3D Molecule Generation »
Minkai Xu · Alexander Powers · Ron Dror · Stefano Ermon · Jure Leskovec -
2023 Poster: Reflected Diffusion Models »
Aaron Lou · Stefano Ermon -
2023 Poster: Long Horizon Temperature Scaling »
Andy Shih · Dorsa Sadigh · Stefano Ermon -
2023 Poster: InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models »
Yingheng Wang · Yair Schiff · Aaron Gokaslan · Weishen Pan · Fei Wang · Chris De Sa · Volodymyr Kuleshov -
2023 Poster: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration »
Naoki Murata · Koichi Saito · Chieh-Hsin Lai · Yuhta Takida · Toshimitsu Uesaka · Yuki Mitsufuji · Stefano Ermon -
2023 Poster: FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation »
Chieh-Hsin Lai · Yuhta Takida · Naoki Murata · Toshimitsu Uesaka · Yuki Mitsufuji · Stefano Ermon -
2023 Poster: Deep Latent State Space Models for Time-Series Generation »
Linqi Zhou · Michael Poli · Winnie Xu · Stefano Massaroli · Stefano Ermon -
2023 Oral: GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration »
Naoki Murata · Koichi Saito · Chieh-Hsin Lai · Yuhta Takida · Toshimitsu Uesaka · Yuki Mitsufuji · Stefano Ermon -
2023 Poster: CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations »
Gengchen Mai · Ni Lao · Yutong He · Jiaming Song · Stefano Ermon -
2023 Poster: Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation »
Jiaming Song · Qinsheng Zhang · Hongxu Yin · Morteza Mardani · Ming-Yu Liu · Jan Kautz · Yongxin Chen · Arash Vahdat -
2023 Poster: Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows »
Phillip Si · Zeyi Chen · Subham S Sahoo · Yair Schiff · Volodymyr Kuleshov -
2022 : FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness »
Tri Dao · Daniel Y Fu · Stefano Ermon · Atri Rudra · Christopher Re -
2022 : Generative Modeling with Stochastic Differential Equations »
Stefano Ermon -
2022 : Neural Geometric Embedding Flows »
Aaron Lou · Yang Song · Jiaming Song · Stefano Ermon -
2022 Workshop: Adaptive Experimental Design and Active Learning in the Real World »
Mojmir Mutny · Willie Neiswanger · Ilija Bogunovic · Stefano Ermon · Yisong Yue · Andreas Krause -
2022 Poster: Imitation Learning by Estimating Expertise of Demonstrators »
Mark Beliaev · Andy Shih · Stefano Ermon · Dorsa Sadigh · Ramtin Pedarsani -
2022 Poster: Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation »
Volodymyr Kuleshov · Shachi Deshpande -
2022 Spotlight: Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation »
Volodymyr Kuleshov · Shachi Deshpande -
2022 Spotlight: Imitation Learning by Estimating Expertise of Demonstrators »
Mark Beliaev · Andy Shih · Stefano Ermon · Dorsa Sadigh · Ramtin Pedarsani -
2022 Poster: A General Recipe for Likelihood-free Bayesian Optimization »
Jiaming Song · Lantao Yu · Willie Neiswanger · Stefano Ermon -
2022 Poster: Popular decision tree algorithms are provably noise tolerant »
Guy Blanc · Jane Lange · Ali Malik · Li-Yang Tan -
2022 Oral: A General Recipe for Likelihood-free Bayesian Optimization »
Jiaming Song · Lantao Yu · Willie Neiswanger · Stefano Ermon -
2022 Spotlight: Popular decision tree algorithms are provably noise tolerant »
Guy Blanc · Jane Lange · Ali Malik · Li-Yang Tan -
2022 Poster: ButterflyFlow: Building Invertible Layers with Butterfly Matrices »
Chenlin Meng · Linqi Zhou · Kristy Choi · Tri Dao · Stefano Ermon -
2022 Poster: Bit Prioritization in Variational Autoencoders via Progressive Coding »
Rui Shu · Stefano Ermon -
2022 Poster: Modular Conformal Calibration »
Charles Marx · Shengjia Zhao · Willie Neiswanger · Stefano Ermon -
2022 Spotlight: Bit Prioritization in Variational Autoencoders via Progressive Coding »
Rui Shu · Stefano Ermon -
2022 Spotlight: Modular Conformal Calibration »
Charles Marx · Shengjia Zhao · Willie Neiswanger · Stefano Ermon -
2022 Spotlight: ButterflyFlow: Building Invertible Layers with Butterfly Matrices »
Chenlin Meng · Linqi Zhou · Kristy Choi · Tri Dao · Stefano Ermon -
2021 : Invited Talk 5 (Stefano Ermon): Maximum Likelihood Training of Score-Based Diffusion Models »
Stefano Ermon -
2021 Poster: Temporal Predictive Coding For Model-Based Planning In Latent Space »
Tung Nguyen · Rui Shu · Tuan Pham · Hung Bui · Stefano Ermon -
2021 Spotlight: Temporal Predictive Coding For Model-Based Planning In Latent Space »
Tung Nguyen · Rui Shu · Tuan Pham · Hung Bui · Stefano Ermon -
2021 Poster: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information »
Willie Neiswanger · Ke Alexander Wang · Stefano Ermon -
2021 Spotlight: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information »
Willie Neiswanger · Ke Alexander Wang · Stefano Ermon -
2021 Poster: Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving »
Yang Song · Chenlin Meng · Renjie Liao · Stefano Ermon -
2021 Spotlight: Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving »
Yang Song · Chenlin Meng · Renjie Liao · Stefano Ermon -
2021 Poster: Reward Identification in Inverse Reinforcement Learning »
Kuno Kim · Shivam Garg · Kirankumar Shiragur · Stefano Ermon -
2021 Spotlight: Reward Identification in Inverse Reinforcement Learning »
Kuno Kim · Shivam Garg · Kirankumar Shiragur · Stefano Ermon -
2020 Poster: Predictive Coding for Locally-Linear Control »
Rui Shu · Tung Nguyen · Yinlam Chow · Tuan Pham · Khoat Than · Mohammad Ghavamzadeh · Stefano Ermon · Hung Bui -
2020 Poster: Bridging the Gap Between f-GANs and Wasserstein GANs »
Jiaming Song · Stefano Ermon -
2020 Poster: Individual Calibration with Randomized Forecasting »
Shengjia Zhao · Tengyu Ma · Stefano Ermon -
2020 Poster: Domain Adaptive Imitation Learning »
Kuno Kim · Yihong Gu · Jiaming Song · Shengjia Zhao · Stefano Ermon -
2020 Poster: Training Deep Energy-Based Models with f-Divergence Minimization »
Lantao Yu · Yang Song · Jiaming Song · Stefano Ermon -
2020 Poster: Fair Generative Modeling via Weak Supervision »
Kristy Choi · Aditya Grover · Trisha Singh · Rui Shu · Stefano Ermon -
2019 : Networking Lunch (provided) + Poster Session »
Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki -
2019 : Towards a Sustainable Food Supply Chain Powered by Artificial Intelligence »
Volodymyr Kuleshov -
2019 Poster: Graphite: Iterative Generative Modeling of Graphs »
Aditya Grover · Aaron Zweig · Stefano Ermon -
2019 Poster: Adaptive Antithetic Sampling for Variance Reduction »
Hongyu Ren · Shengjia Zhao · Stefano Ermon -
2019 Oral: Adaptive Antithetic Sampling for Variance Reduction »
Hongyu Ren · Shengjia Zhao · Stefano Ermon -
2019 Oral: Graphite: Iterative Generative Modeling of Graphs »
Aditya Grover · Aaron Zweig · Stefano Ermon -
2019 Poster: Multi-Agent Adversarial Inverse Reinforcement Learning »
Lantao Yu · Jiaming Song · Stefano Ermon -
2019 Poster: Neural Joint Source-Channel Coding »
Kristy Choi · Kedar Tatwawadi · Aditya Grover · Tsachy Weissman · Stefano Ermon -
2019 Oral: Neural Joint Source-Channel Coding »
Kristy Choi · Kedar Tatwawadi · Aditya Grover · Tsachy Weissman · Stefano Ermon -
2019 Oral: Multi-Agent Adversarial Inverse Reinforcement Learning »
Lantao Yu · Jiaming Song · Stefano Ermon -
2018 Poster: Modeling Sparse Deviations for Compressed Sensing using Generative Models »
Manik Dhar · Aditya Grover · Stefano Ermon -
2018 Oral: Modeling Sparse Deviations for Compressed Sensing using Generative Models »
Manik Dhar · Aditya Grover · Stefano Ermon -
2018 Poster: Accelerating Natural Gradient with Higher-Order Invariance »
Yang Song · Jiaming Song · Stefano Ermon -
2018 Poster: Accurate Uncertainties for Deep Learning Using Calibrated Regression »
Volodymyr Kuleshov · Nathan Fenner · Stefano Ermon -
2018 Oral: Accelerating Natural Gradient with Higher-Order Invariance »
Yang Song · Jiaming Song · Stefano Ermon -
2018 Oral: Accurate Uncertainties for Deep Learning Using Calibrated Regression »
Volodymyr Kuleshov · Nathan Fenner · Stefano Ermon -
2017 Poster: Learning Hierarchical Features from Deep Generative Models »
Shengjia Zhao · Jiaming Song · Stefano Ermon -
2017 Talk: Learning Hierarchical Features from Deep Generative Models »
Shengjia Zhao · Jiaming Song · Stefano Ermon