We consider the Bayesian theory of mind (BTOM) framework for learning from demonstrations via inverse reinforcement learning (IRL). The BTOM model consists of a joint representation of the agent's reward function and the agent's internal subjective model of the environment dynamics, which may be inaccurate. In this paper, we make use of a class of prior distributions that parametrize how accurate the agent's model of the environment is, and develop efficient algorithms to estimate the agent's reward and subjective dynamics in high-dimensional settings. The BTOM framework departs from existing offline model-based IRL approaches by estimating reward and dynamics simultaneously. Our analysis reveals a novel insight: the estimated policy exhibits robust performance when the (expert) agent is believed (a priori) to have a highly accurate model of the environment. We verify this observation in MuJoCo environments and show that our algorithms outperform state-of-the-art offline IRL algorithms.
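To make the joint-estimation idea concrete, the following is a minimal sketch, not the paper's actual algorithm: on a toy tabular MDP, a Boltzmann-rational expert policy is computed under a *subjective* dynamics model, and the estimation objective combines the demonstration log-likelihood with a prior that ties the subjective dynamics to the nominal dynamics. The scalar `lam` plays the role of the prior's accuracy parameter (large `lam` encodes a belief that the expert's model is highly accurate). All names (`soft_value_iteration`, `neg_log_posterior`, the 2-state MDP) are illustrative assumptions, not the authors' API.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP; P[s, a, s'] are nominal transition probabilities.
n_s, n_a = 2, 2
P_true = np.array([[[0.9, 0.1], [0.2, 0.8]],
                   [[0.8, 0.2], [0.1, 0.9]]])

def soft_value_iteration(r, P, gamma=0.9, iters=200):
    """Boltzmann-rational policy for state reward r under dynamics P."""
    V = np.zeros(n_s)
    for _ in range(iters):
        Q = r[:, None] + gamma * P @ V        # Q[s, a]; P @ V sums over s'
        V = np.log(np.exp(Q).sum(axis=1))     # soft-max (log-sum-exp) backup
    Q = r[:, None] + gamma * P @ V
    return np.exp(Q - V[:, None])             # rows already sum to 1

def neg_log_posterior(r, P_hat, demos, lam):
    """Demo negative log-likelihood under the subjective dynamics P_hat,
    plus a prior penalizing P_hat's deviation from the nominal dynamics.
    lam parametrizes how accurate the expert's model is believed to be."""
    pi = soft_value_iteration(r, P_hat)
    nll = -sum(np.log(pi[s, a]) for s, a in demos)
    prior = lam * np.sum((P_hat - P_true) ** 2)
    return nll + prior
```

Minimizing `neg_log_posterior` jointly over the reward `r` and the subjective dynamics `P_hat` (e.g. by gradient descent) gives the flavor of simultaneous reward/dynamics estimation; as `lam` grows, the estimate of `P_hat` is pulled toward the nominal dynamics.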
Author Information
Ran Wei (Texas A&M University - College Station)
Siliang Zeng (University of Minnesota, Twin Cities)
Chenliang Li (The Chinese University of Hong Kong)
Alfredo Garcia (Texas A&M University)
Anthony McDonald
Mingyi Hong (University of Minnesota)
More from the Same Authors
- 2021: Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy
  xinwei zhang · Xiangyi Chen · Steven Wu · Mingyi Hong
- 2022: Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees
  Siliang Zeng · Chenliang Li · Alfredo Garcia · Mingyi Hong
- 2023 Poster: Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach
  Prashant Khanduri · Ioannis Tsaknakis · Yihua Zhang · Jia Liu · Sijia Liu · Jiawei Zhang · Mingyi Hong
- 2023 Poster: Understanding Backdoor Attacks through the Adaptability Hypothesis
  Xun Xian · Ganghua Wang · Jayanth Srinivasa · Ashish Kundu · Xuan Bi · Mingyi Hong · Jie Ding
- 2023 Poster: FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks
  Bingqing Song · Prashant Khanduri · xinwei zhang · Jinfeng Yi · Mingyi Hong
- 2022 Poster: A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms
  xinwei zhang · Mingyi Hong · Sairaj Dhople · Nicola Elia
- 2022 Spotlight: A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms
  xinwei zhang · Mingyi Hong · Sairaj Dhople · Nicola Elia
- 2022 Poster: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
  xinwei zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi
- 2022 Poster: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
  Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu
- 2022 Spotlight: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
  Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu
- 2022 Spotlight: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
  xinwei zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi
- 2021 Spotlight: Decentralized Riemannian Gradient Descent on the Stiefel Manifold
  Shixiang Chen · Alfredo Garcia · Mingyi Hong · Shahin Shahrampour
- 2021 Poster: Decentralized Riemannian Gradient Descent on the Stiefel Manifold
  Shixiang Chen · Alfredo Garcia · Mingyi Hong · Shahin Shahrampour
- 2020 Poster: Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking
  Haoran Sun · Songtao Lu · Mingyi Hong
- 2020 Poster: Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks
  Sijia Liu · Songtao Lu · Xiangyi Chen · Yao Feng · Kaidi Xu · Abdullah Al-Dujaili · Mingyi Hong · Una-May O'Reilly
- 2019 Poster: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization
  Songtao Lu · Mingyi Hong · Zhengdao Wang
- 2019 Oral: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization
  Songtao Lu · Mingyi Hong · Zhengdao Wang
- 2018 Poster: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks
  Mingyi Hong · Meisam Razaviyayn · Jason Lee
- 2018 Oral: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks
  Mingyi Hong · Meisam Razaviyayn · Jason Lee