Online Learning with Dependent Stochastic Feedback Graphs

Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Ningshan Zhang

Keywords: [ Online Learning / Bandits ] [ Other ] [ Online Learning, Active Learning, and Bandits ]

[ Abstract ]
Thu 16 Jul 6 a.m. PDT — 6:45 a.m. PDT
Thu 16 Jul 5 p.m. PDT — 5:45 p.m. PDT


A general framework for online learning with partial information is one where feedback graphs specify which losses can be observed by the learner. We study a challenging scenario where feedback graphs vary stochastically with time and, more importantly, where graphs and losses are dependent. This scenario appears in several real-world applications that we describe where the outcome of actions are correlated. We devise a new algorithm for this setting that exploits the stochastic properties of the graphs and that benefits from favorable regret guarantees. We present a detailed theoretical analysis of this algorithm, and also report the result of a series of experiments on real-world datasets, which show that our algorithm outperforms standard baselines for online learning with feedback graphs.

Chat is not available.