Workshop: Complex feedback in online learning

Threshold Bandit Problem with Link Assumption between Pulls and Duels

Keshav Narayan · Aarti Singh


Building on prior work (Xu et al., 2020), we pro-pose a newer version of the Rank-Search algo-rithm, which aims to solve the Threshold BanditProblem when both duels and pulls are possible.Our algorithm, which requires an additional as-sumption connecting the borda scores of armswith their mean rewards, is able to perform in set-tings in which previous Rank-Search algorithmscannot. We conduct experiments comparing theperformance of the new algorithm with that ofolder versions of Rank-Search in various settings.Finally we prove upper bounds for the total num-ber of duels and pulls required by our proposedalgorithm, which we call Rank Search with Link(RS-L).

Chat is not available.