Stochastic Rising Bandits (SRBs) model sequential decision-making problems in which the expected rewards of the available options increase every time they are selected. This setting captures a wide range of scenarios in which the available options are learning entities whose performance improves (in expectation) over time. While previous works addressed the regret minimization problem, this paper focuses on the fixed-budget Best Arm Identification (BAI) problem for SRBs. In this scenario, given a fixed budget of rounds, we are asked to recommend the best option at the end of the identification process. We propose two algorithms for this setting: R-UCBE, which follows a UCB-like approach, and R-SR, which employs a successive reject procedure. We then prove that, with a sufficiently large budget, both provide guarantees on the probability of correctly identifying the optimal option at the end of the learning process. Furthermore, we derive a lower bound on the error probability, matched by R-SR (up to logarithmic factors), and show that the need for a sufficiently large budget is unavoidable in the SRB setting. Finally, we numerically validate the proposed algorithms in both synthetic and real-world environments and compare them with the currently available BAI strategies.
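To make the fixed-budget setting concrete, here is a minimal sketch of the classic Successive Rejects schedule (Audibert and Bubeck, 2010) run on a toy rising bandit. It is not the paper's R-SR: the abstract does not specify R-SR's estimator or phase lengths, so the recent-half averaging, the exponential reward curves, and all function names below are assumptions made purely for illustration.

```python
import math
import random


def sr_phase_lengths(num_arms, budget):
    """Cumulative per-arm pull counts n_1..n_{K-1} of classic Successive Rejects."""
    log_bar = 0.5 + sum(1.0 / i for i in range(2, num_arms + 1))
    return [
        math.ceil((budget - num_arms) / (log_bar * (num_arms + 1 - k)))
        for k in range(1, num_arms)
    ]


def successive_rejects(pull, num_arms, budget):
    """Recommend one arm after spending at most `budget` pulls.

    `pull(arm)` is a hypothetical callback that returns one noisy reward.
    """
    if num_arms < 2 or budget <= num_arms:
        raise ValueError("need at least 2 arms and budget > num_arms")
    rewards = {arm: [] for arm in range(num_arms)}
    active = list(range(num_arms))
    previous = 0
    for target in sr_phase_lengths(num_arms, budget):
        # Bring every surviving arm up to `target` total pulls.
        for arm in active:
            for _ in range(target - previous):
                rewards[arm].append(pull(arm))
        previous = target

        def recent_mean(arm):
            # Assumption (not from the paper): average only the newest half of
            # the samples, since older samples of a rising arm understate it.
            samples = rewards[arm][len(rewards[arm]) // 2:]
            return sum(samples) / len(samples)

        # Reject the arm that currently looks worst.
        active.remove(min(active, key=recent_mean))
    return active[0]


def make_rising_bandit(ceilings, rates, noise_std, seed=0):
    """Toy rising bandit: arm i's mean after n pulls is c_i * (1 - exp(-r_i * n))."""
    rng = random.Random(seed)
    counts = [0] * len(ceilings)

    def pull(arm):
        counts[arm] += 1
        mean = ceilings[arm] * (1.0 - math.exp(-rates[arm] * counts[arm]))
        return mean + rng.gauss(0.0, noise_std)

    return pull, counts
```

Calling `make_rising_bandit` with a slowly rising arm that has the highest ceiling, then passing the returned `pull` to `successive_rejects`, shows why the budget matters in this toy model: if the budget is too small, the slow arm is rejected before its rewards have had time to grow.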
Author Information
Alessandro Montenegro (Polytechnic Institute of Milan)
Marco Mussi (Politecnico di Milano)
Francesco Trovò (Politecnico di Milano)
Marcello Restelli (Politecnico di Milano)
Alberto Maria Metelli (Politecnico di Milano)
More from the Same Authors
-
2021 : Meta Learning the Step Size in Policy Gradient Methods »
Luca Sabbioni · Francesco Corda · Marcello Restelli -
2021 : Subgaussian Importance Sampling for Off-Policy Evaluation and Learning »
Alberto Maria Metelli · Alessio Russo · Marcello Restelli -
2021 : The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2021 : Efficient Inverse Reinforcement Learning of Transferable Rewards »
Giorgia Ramponi · Alberto Maria Metelli · Marcello Restelli -
2021 : Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 : Reward-Free Policy Space Compression for Reinforcement Learning »
Mirco Mutti · Stefano Del Col · Marcello Restelli -
2021 : Learning to Explore Multiple Environments without Rewards »
Mirco Mutti · Mattia Mancassola · Marcello Restelli -
2022 : Challenging Common Assumptions in Convex Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Piersilvio De Bartolomeis · Marcello Restelli -
2022 : Stochastic Rising Bandits for Online Model Selection »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 : Dynamical Linear Bandits for Long-Lasting Vanishing Rewards »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2022 : Invariance Discovery for Systematic Generalization in Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Emanuele Rossi · Juan Calderon · Michael Bronstein · Marcello Restelli -
2022 : Recursive History Representations for Unsupervised Reinforcement Learning in Multiple-Environments »
Mirco Mutti · Pietro Maldini · Riccardo De Santi · Marcello Restelli -
2022 : Directed Exploration via Uncertainty-Aware Critics »
Amarildo Likmeta · Matteo Sacco · Alberto Maria Metelli · Marcello Restelli -
2022 : Non-Markovian Policies for Unsupervised Reinforcement Learning in Multiple Environments »
Pietro Maldini · Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2023 : Parameterized projected Bellman operator »
Théo Vincent · Alberto Maria Metelli · Jan Peters · Marcello Restelli · Carlo D'Eramo -
2023 Poster: Optimal Rates and Efficient Algorithms for Online Bayesian Persuasion »
Martino Bernasconi · Matteo Castiglioni · Andrea Celli · Alberto Marchesi · Francesco Trovò · Nicola Gatti -
2023 Poster: Dynamical Linear Bandits »
Marco Mussi · Alberto Maria Metelli · Marcello Restelli -
2023 Oral: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Towards Theoretical Understanding of Inverse Reinforcement Learning »
Alberto Maria Metelli · Filippo Lazzati · Marcello Restelli -
2023 Poster: Constrained Phi-Equilibria »
Martino Bernasconi · Matteo Castiglioni · Alberto Marchesi · Francesco Trovò · Nicola Gatti -
2023 Poster: Truncating Trajectories in Monte Carlo Reinforcement Learning »
Riccardo Poiani · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: Safe Learning in Tree-Form Sequential Decision Making: Handling Hard and Soft Constraints »
Martino Bernasconi · Federico Cacciamani · Matteo Castiglioni · Alberto Marchesi · Nicola Gatti · Francesco Trovò -
2022 Spotlight: Safe Learning in Tree-Form Sequential Decision Making: Handling Hard and Soft Constraints »
Martino Bernasconi · Federico Cacciamani · Matteo Castiglioni · Alberto Marchesi · Nicola Gatti · Francesco Trovò -
2022 Spotlight: Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning »
Angelo Damiani · Giorgio Manganini · Alberto Maria Metelli · Marcello Restelli -
2022 Oral: The Importance of Non-Markovianity in Maximum State Entropy Exploration »
Mirco Mutti · Riccardo De Santi · Marcello Restelli -
2022 Poster: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2022 Poster: Delayed Reinforcement Learning by Imitation »
Pierre Liotet · Davide Maran · Lorenzo Bisi · Marcello Restelli -
2022 Spotlight: Delayed Reinforcement Learning by Imitation »
Pierre Liotet · Davide Maran · Lorenzo Bisi · Marcello Restelli -
2022 Spotlight: Stochastic Rising Bandits »
Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli -
2021 Poster: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Spotlight: Leveraging Good Representations in Linear Contextual Bandits »
Matteo Papini · Andrea Tirinzoni · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Poster: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2021 Spotlight: Provably Efficient Learning of Transferable Rewards »
Alberto Maria Metelli · Giorgia Ramponi · Alessandro Concetti · Marcello Restelli -
2020 Poster: Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning »
Alberto Maria Metelli · Flavio Mazzolini · Lorenzo Bisi · Luca Sabbioni · Marcello Restelli -
2020 Poster: Sequential Transfer in Reinforcement Learning with a Generative Model »
Andrea Tirinzoni · Riccardo Poiani · Marcello Restelli -
2019 Poster: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Oral: Reinforcement Learning in Configurable Continuous Environments »
Alberto Maria Metelli · Emanuele Ghelfi · Marcello Restelli -
2019 Poster: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Oral: Transfer of Samples in Policy Search via Multiple Importance Sampling »
Andrea Tirinzoni · Mattia Salvini · Marcello Restelli -
2019 Poster: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2019 Oral: Optimistic Policy Optimization via Multiple Importance Sampling »
Matteo Papini · Alberto Maria Metelli · Lorenzo Lupo · Marcello Restelli -
2018 Poster: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Poster: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2018 Poster: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Importance Weighted Transfer of Samples in Reinforcement Learning »
Andrea Tirinzoni · Andrea Sessa · Matteo Pirotta · Marcello Restelli -
2018 Oral: Configurable Markov Decision Processes »
Alberto Maria Metelli · Mirco Mutti · Marcello Restelli -
2018 Oral: Stochastic Variance-Reduced Policy Gradient »
Matteo Papini · Damiano Binaghi · Giuseppe Canonaco · Matteo Pirotta · Marcello Restelli -
2017 Poster: Boosted Fitted Q-Iteration »
Samuele Tosatto · Matteo Pirotta · Carlo D'Eramo · Marcello Restelli -
2017 Talk: Boosted Fitted Q-Iteration »
Samuele Tosatto · Matteo Pirotta · Carlo D'Eramo · Marcello Restelli