Regret Minimization With a Crowd of Awakening Experts
Anna Lunghi ⋅ Gianmarco Genalti ⋅ Alberto Marchesi ⋅ Matteo Castiglioni
Abstract
We study the Awakening Crowd of Experts (ACE) problem, an online learning problem where the set of experts available to the learner grows at each round. ACE is a special case of the well-known sleeping experts problem (Kleinberg et al., 2010) in which the number of experts is huge ($K = T$). Existing results on sleeping experts preclude any learner from achieving sublinear regret when the number of available experts is linear in $T$. Inspired by real-world applications, such as Q\&A platforms and social proof marketing, we thus focus on the awakening version of the sleeping experts problem, where a new expert arrives at every round and never leaves. We show that in the stochastic version of ACE, it is possible to obtain regret $\tilde{\mathcal{O}}(T^{2/3})$ using an unusual pessimism-in-the-face-of-uncertainty principle. Moreover, we characterize the dependence of the regret on the stability of an optimal strategy. For both results, we present matching lower bounds. Surprisingly, the adversarial version of ACE is considerably harder. In particular, we provide a lower bound precluding sublinear $\alpha$-regret when the competitive ratio $\alpha$ is constant. We provide an algorithm to address this crucial trade-off between competitive ratio and regret, and bound its $\alpha$-regret, almost matching the aforementioned lower bound. As a corollary, we obtain a $\tilde{\mathcal{O}}(\log\log T)$ competitive ratio when an optimal strategy enjoys a reward linear in $T$.
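To make the interaction protocol concrete, the following is a minimal, hypothetical sketch of the stochastic ACE setting: a fresh expert with an unknown Bernoulli mean awakens at every round and never leaves, and the learner follows one expert per round. The pessimistic rule shown (following the expert with the largest *lower* confidence bound, where untried experts receive the most pessimistic bound) is only an illustrative assumption about what a pessimism-in-the-face-of-uncertainty principle could look like here, not the paper's actual algorithm; all function names and constants are ours.

```python
import math
import random

def simulate_ace(T, seed=0):
    """Illustrative stochastic-ACE simulation (not the paper's algorithm)."""
    rng = random.Random(seed)
    means = []                 # unknown Bernoulli mean of each awakened expert
    pulls, rewards = [], []    # per-expert statistics kept by the learner

    def lcb(i, t):
        # Pessimism: untried experts get the most pessimistic bound (0),
        # so the learner prefers experts it has already vetted.
        if pulls[i] == 0:
            return 0.0
        return rewards[i] / pulls[i] - math.sqrt(2 * math.log(t + 1) / pulls[i])

    total = 0.0
    for t in range(T):
        # A new expert awakens at every round and never leaves.
        means.append(rng.random())
        pulls.append(0)
        rewards.append(0.0)
        # Follow the expert maximizing the lower confidence bound.
        i = max(range(len(means)), key=lambda j: lcb(j, t))
        r = 1.0 if rng.random() < means[i] else 0.0
        pulls[i] += 1
        rewards[i] += r
        total += r
    return total, len(means)
```

After $T$ rounds, exactly $T$ experts are awake ($K = T$), which is why the sleeping-experts lower bounds cited in the abstract would otherwise rule out sublinear regret.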