When the agent's observations or interactions are delayed, classic reinforcement learning tools usually fail. In this paper, we propose a simple, novel, and efficient solution to this problem. We assume that an efficient policy for the undelayed environment is known or can be easily learnt, but that the task may suffer from delays in practice, which we therefore want to take into account. We present a novel algorithm, Delayed Imitation with Dataset Aggregation (DIDA), which builds upon imitation learning methods to learn how to act in a delayed environment from undelayed demonstrations. We provide a theoretical analysis of the approach that guides the practical design of DIDA. These results are also of general interest to the delayed reinforcement learning literature, as they provide bounds on the performance gap between delayed and undelayed tasks under smoothness conditions. We show empirically that DIDA achieves high performance with remarkable sample efficiency on a variety of tasks, including robotic locomotion, classic control, and trading.
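The idea of imitating an undelayed expert from delayed information can be illustrated with a small toy sketch. This is not the authors' implementation: the integer chain environment, the `expert` rule, the lookup-table learner, and the names `rollout` and `train` are all assumptions made for illustration. The sketch assumes the delayed agent sees the state from `delay` steps ago together with the actions taken since then, and that each such augmented input is labelled with the action the expert would take in the current, not yet observed, state, with data aggregated across iterations in the style of dataset aggregation.

```python
from collections import deque


def expert(x):
    """Hypothetical undelayed expert: take one step toward the origin."""
    return (x < 0) - (x > 0)


def rollout(policy, x0, delay, horizon):
    """Run `policy` on a toy chain (x' = x + a) with observation delay.

    Returns (samples, final_state); each sample pairs the augmented input
    (delayed state, actions taken since) with the expert's action for the
    true current state.
    """
    x = x0
    old_states = deque([x] * (delay + 1), maxlen=delay + 1)  # x_{t-d}..x_t
    old_actions = deque([0] * delay, maxlen=delay)           # a_{t-d}..a_{t-1}
    samples = []
    for _ in range(horizon):
        aug = (old_states[0], tuple(old_actions))
        samples.append((aug, expert(x)))  # label comes from the undelayed expert
        a = policy(aug, x)                # the learner ignores x; the expert uses it
        x += a
        old_states.append(x)
        old_actions.append(a)
    return samples, x


def train(delay=3, iterations=3, horizon=20, starts=range(-5, 6)):
    """Dataset-aggregation loop with a lookup table as a stand-in learner."""
    dataset, table = [], {}

    def learner(aug, x):
        return table.get(aug, 0)  # unseen inputs default to "stay"

    for i in range(iterations):
        # Iteration 0 collects data with the expert, later ones with the learner.
        behaviour = (lambda aug, x: expert(x)) if i == 0 else learner
        for x0 in starts:
            samples, _ = rollout(behaviour, x0, delay, horizon)
            dataset.extend(samples)       # aggregate, never discard old data
        table.clear()
        table.update(dataset)             # "fit" the learner on all data so far
    return learner
```

Because the toy dynamics are deterministic, the augmented input fully determines the current state, so a lookup table is enough here; a stochastic or continuous task would need a real function approximator and more care in how data is collected.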
Author Information
Pierre Liotet (Politecnico di Milano)
I am currently a Ph.D. student at Politecnico di Milano. My research interests include reinforcement learning under partial observability and delays, as well as lifelong reinforcement learning.
Davide Maran (Politecnico di Milano)
Lorenzo Bisi (Politecnico di Milano)
Marcello Restelli (Politecnico di Milano)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Spotlight: Delayed Reinforcement Learning by Imitation »
Wed. Jul 20th 09:40 -- 09:45 PM Room Hall G
More from the Same Authors
-
2021 : Meta Learning the Step Size in Policy Gradient Methods »
Luca Sabbioni · Francesco Corda · Marcello Restelli -
2021 : Subgaussian Importance Sampling for Off-Policy Evaluation and Learning »
Alberto Maria Metelli · Alessio Russo · Marcello Restelli -
2021 : The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2021 : Efficient Inverse Reinforcement Learning of Transferable Rewards »
Giorgia Ramponi · Alberto Maria Metelli · Marcello Restelli -
2021 : Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 : Reward-Free Policy Space Compression for Reinforcement Learning »
Mirco Mutti · Stefano Del Col · Marcello Restelli -
2021 : Learning to Explore Multiple Environments without Rewards »
Mirco Mutti · Mattia Mancassola · Marcello Restelli -
2022 : Challenging Common Assumptions in Convex Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Piersilvio De Bartolomeis · Marcello Restelli -
2022 : Stochastic Rising Bandits for Online Model Selection »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 : Dynamical Linear Bandits for Long-Lasting Vanishing Rewards »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2022 : Invariance Discovery for Systematic Generalization in Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Emanuele Rossi · Juan Calderon · Michael Bronstein · Marcello Restelli -
2022 : Recursive History Representations for Unsupervised Reinforcement Learning in Multiple-Environments »
Mirco Mutti · Pietro Maldini · Riccardo De Santi · Marcello Restelli -
2022 : Directed Exploration via Uncertainty-Aware Critics »
Amarildo Likmeta · Matteo Sacco · Alberto Maria Metelli · Marcello Restelli -
2022 : Non-Markovian Policies for Unsupervised Reinforcement Learning in Multiple Environments »
Pietro Maldini · Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2023 : A Best Arm Identification Approach for Stochastic Rising Bandits »
Alessandro Montenegro · Marco Mussi · Francesco Trovò · Marcello Restelli · Alberto Maria Metelli -
2023 : Parameterized projected Bellman operator »
Théo Vincent · Alberto Maria Metelli · Jan Peters · Marcello Restelli · Carlo D'Eramo -
2023 Poster: Dynamical Linear Bandits »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2023 Oral: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Truncating Trajectories in Monte Carlo Reinforcement Learning »
Riccardo Poiani · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Spotlight: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Oral: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 Spotlight: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2021 Poster: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Spotlight: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Poster: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2021 Spotlight: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2020 Poster: Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning »
Alberto Maria Metelli · Flavio Mazzolini · Lorenzo Bisi · Luca Sabbioni · Marcello Restelli -
2020 Poster: Sequential Transfer in Reinforcement Learning with a Generative Model »
Andrea Tirinzoni · Riccardo Poiani · Marcello Restelli -
2019 Poster: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Oral: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Poster: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Oral: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Poster: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2019 Oral: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2018 Poster: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Poster: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2018 Poster: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Oral: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2017 Poster: Boosted Fitted Q-Iteration »
Samuele Tosatto · Matteo Pirotta · Carlo D'Eramo · Marcello Restelli -
2017 Talk: Boosted Fitted Q-Iteration »
Samuele Tosatto · Matteo Pirotta · Carlo D'Eramo · Marcello Restelli