Timezone: »
This paper is about the study of B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals across all iterations. The advantage of such approach w.r.t. to other AVI methods is twofold: (1) while keeping the same function space at each iteration, B-FQI can represent more complex functions by considering an additive model; (2) since the Bellman residual decreases as the optimal value function is approached, regression problems become easier as iterations proceed. We study B-FQI both theoretically, providing also a finite-sample error upper bound for it, and empirically, by comparing its performance to the one of FQI in different domains and using different regression techniques.
Author Information
Samuele Tosatto (Politecnico di Milano)
Samuele Tosatto joined the Institute for Intelligent Autonomous Systems (IAS) at TU Darmstadt in May 2017 as a Ph.D. student. Samuele received his bachelor degree as well as his master degree in software engineering from Polytechnic University of Milan. During his studies he focused on machine learning and more in particular in reinforcement learning. His thesis, entitled “Boosted Fitted Q-Iteration", was written under the supervision of Prof. Marcello Restelli PhD Matteo Pirotta and Ing Carlo D'Eramo.
Matteo Pirotta (SequeL - Inria Lille - Nord Europe)
Carlo D'Eramo (Politecnico di Milano)
Marcello Restelli (Politecnico di Milano)
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Poster: Boosted Fitted Q-Iteration »
Mon. Aug 7th 08:30 AM -- 12:00 PM Room Gallery #28
More from the Same Authors
-
2021 : Meta Learning the Step Size in Policy Gradient Methods »
Luca Sabbioni · Francesco Corda · Marcello Restelli -
2021 : Subgaussian Importance Sampling for Off-Policy Evaluation and Learning »
Alberto Maria Metelli · Alessio Russo · Marcello Restelli -
2021 : The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2021 : Efficient Inverse Reinforcement Learning of Transferable Rewards »
Giorgia Ramponi · Alberto Maria Metelli · Marcello Restelli -
2021 : Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 : Reward-Free Policy Space Compression for Reinforcement Learning »
Mirco Mutti · Stefano Del Col · Marcello Restelli -
2021 : Learning to Explore Multiple Environments without Rewards »
Mirco Mutti · Mattia Mancassola · Marcello Restelli -
2021 : The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 : Challenging Common Assumptions in Convex Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Piersilvio De Bartolomeis · Marcello Restelli -
2022 : Stochastic Rising Bandits for Online Model Selection »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 : Dynamical Linear Bandits for Long-Lasting Vanishing Rewards »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2022 : Invariance Discovery for Systematic Generalization in Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Emanuele Rossi · Juan Calderon · Michael Bronstein · Marcello Restelli -
2022 : Recursive History Representations for Unsupervised Reinforcement Learning in Multiple-Environments »
Mirco Mutti · Pietro Maldini · Riccardo De Santi · Marcello Restelli -
2022 : Directed Exploration via Uncertainty-Aware Critics »
Amarildo Likmeta · Matteo Sacco · Alberto Maria Metelli · Marcello Restelli -
2022 : Non-Markovian Policies for Unsupervised Reinforcement Learning in Multiple Environments »
Pietro Maldini · Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2023 : A Best Arm Identification Approach for Stochastic Rising Bandits »
Alessandro Montenegro · Marco Mussi · Francesco Trovò · Marcello Restelli · Alberto Maria Metelli -
2023 : Parameterized projected Bellman operator »
Théo Vincent · Alberto Maria Metelli · Jan Peters · Marcello Restelli · Carlo D'Eramo -
2023 Poster: Dynamical Linear Bandits »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2023 Oral: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Truncating Trajectories in Monte Carlo Reinforcement Learning »
Riccardo Poiani · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Spotlight: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Oral: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 Poster: Delayed Reinforcement Learning by Imitation »
Pierre Liotet · Davide Maran · Lorenzo Bisi · Marcello Restelli -
2022 Spotlight: Delayed Reinforcement Learning by Imitation »
Pierre Liotet · Davide Maran · Lorenzo Bisi · Marcello Restelli -
2022 Spotlight: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2021 Poster: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Spotlight: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Poster: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2021 Spotlight: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2020 Poster: Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning »
Alberto Maria Metelli · Flavio Mazzolini · Lorenzo Bisi · Luca Sabbioni · Marcello Restelli -
2020 Poster: Sequential Transfer in Reinforcement Learning with a Generative Model »
Andrea Tirinzoni · Riccardo Poiani · Marcello Restelli -
2019 Poster: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Oral: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Poster: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Oral: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Poster: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2019 Oral: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2018 Poster: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Poster: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2018 Poster: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Oral: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2018 Poster: Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Ronald Ortner -
2018 Oral: Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Ronald Ortner