

Pure Exploration and Regret Minimization in Matching Bandits

Flore Sentenac · Jialin Yi · Clément Calauzènes · Vianney Perchet · Milan Vojnovic

Keywords: [ Bandits ] [ Reinforcement Learning and Planning ]


Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version, where either a single pair or a full matching is sampled sequentially. We prove that a rank-1 assumption on the adjacency matrix can be leveraged to reduce both the sample complexity and the regret of off-the-shelf algorithms, down to a linear dependency in the number of vertices (up to poly-log terms).
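To illustrate the structure exploited here (a sketch, not the paper's algorithm): under a rank-1 assumption the expected edge weights factor as W[i, j] = u[i] * v[j], so an n x n adjacency matrix is determined by only 2n latent parameters instead of n^2. With nonnegative factors, the rearrangement inequality implies the max-weight matching is assortative — pair the largest u with the largest v — which the snippet below checks against brute force on a small hypothetical instance.

```python
import itertools
import numpy as np

# Hypothetical rank-1 instance: W[i, j] = u[i] * v[j] with positive factors.
rng = np.random.default_rng(0)
n = 5
u = rng.uniform(0.1, 1.0, size=n)  # latent quality of left vertices
v = rng.uniform(0.1, 1.0, size=n)  # latent quality of right vertices
W = np.outer(u, v)                 # rank-1 matrix of expected edge weights

# Assortative matching: sort both sides and pair by rank.
left, right = np.argsort(-u), np.argsort(-v)
sorted_value = W[left, right].sum()

# Brute-force check over all n! matchings (feasible only for small n).
best = max(W[np.arange(n), perm].sum()
           for perm in itertools.permutations(range(n)))
assert np.isclose(sorted_value, best)
print(f"optimal matching value: {sorted_value:.4f}")
```

This is why the rank-1 structure helps: estimating 2n parameters, rather than n^2 independent edge weights, is what makes a near-linear dependency on the number of vertices plausible.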
