Optimal Top-$k$ Identification from Pairwise Comparisons
Motti Goldberger ⋅ Nils Rudi
Abstract
We study the active learning problem of fixed-confidence top-$k$ identification from noisy pairwise comparisons under latent-utility models. The objective is to identify the top-$k$ items with probability at least $1-\delta$ while using as few comparisons as possible by adaptively selecting which pairs to compare. While pure exploration with dueling bandits has been studied, an algorithm achieving asymptotic optimality has not yet been established. We characterize the structure of the information-theoretic lower bound on sample complexity, revealing a structured saddle-point problem. This structure enables a primal--dual algorithm that learns the optimal comparison allocation online while being computationally efficient. We then construct an adaptive comparison-allocation strategy that tracks the optimal solution and prove that the resulting procedure is asymptotically optimal.
Successful Page Load