Spotlight
A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning
Abi Komanduru · Jean Honorio
Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano's inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.
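For context, the Fano's-inequality step referenced in the abstract typically takes the following generic form (a standard statement; the symbols below are illustrative and not the paper's notation). Given an ensemble of $M$ hypotheses with a hypothesis drawn uniformly at random, and $m$ observed trajectories distributed as $P_i$ under hypothesis $i$, any estimator $\hat{\imath}$ satisfies:

```latex
\mathbb{P}\left[\hat{\imath} \neq i\right]
\;\ge\;
1 - \frac{m \cdot \max_{i \neq j} D_{\mathrm{KL}}\!\left(P_i \,\|\, P_j\right) + \log 2}{\log M}
```

Keeping the error probability bounded away from zero thus forces $m \gtrsim \log M / \max_{i \neq j} D_{\mathrm{KL}}(P_i \| P_j)$; with a spherical-code ensemble of size $M = e^{\Omega(n)}$ and small pairwise KL divergences, this is the general mechanism by which an $n \log n$-type sample complexity lower bound can be obtained.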
Author Information
Abi Komanduru (Purdue University)
Jean Honorio (Purdue University)
Related Events (a corresponding poster, oral, or spotlight)

2021 Poster: A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning »
Wed Jul 21st 04:00 – 06:00 PM
More from the Same Authors

2021 Poster: Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation »
Qian Zhang · Yilin Zheng · Jean Honorio 
2021 Spotlight: Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation »
Qian Zhang · Yilin Zheng · Jean Honorio 
2019 Poster: Optimality Implies Kernel Sum Classifiers are Statistically Efficient »
Raphael Meyer · Jean Honorio 
2019 Oral: Optimality Implies Kernel Sum Classifiers are Statistically Efficient »
Raphael Meyer · Jean Honorio 
2018 Poster: Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time »
Asish Ghoshal · Jean Honorio 
2018 Oral: Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time »
Asish Ghoshal · Jean Honorio