Abstract:
We study the problems of *distributed online and bandit convex optimization* against an adaptive adversary. We aim to minimize the average regret on $M$ machines working in parallel over $T$ rounds with $R$ rounds of intermittent communication. Assuming the underlying cost functions are convex and can be generated adaptively, our results show that *collaboration is not beneficial when the machines have access to first-order gradient information at the queried points*. This is in contrast to the stochastic setting, where each machine samples the cost functions from a fixed distribution. Furthermore, we delve into the more challenging setting of *federated online optimization with bandit (zeroth-order) feedback*, where the machines can only access values of the cost functions at the queried points. The key finding here is the *identification of the high-dimensional regime in which collaboration is beneficial and may even lead to a linear speedup in the number of machines*. We further illustrate our findings through federated adversarial linear bandits by developing novel distributed single- and two-point feedback algorithms. Our work is the first attempt towards a systematic understanding of federated online optimization with limited feedback, and it attains tight regret bounds in the intermittent communication setting for both first- and zeroth-order feedback. Our results thus bridge the gap between the stochastic and adaptive settings in federated online optimization.
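For readers unfamiliar with the two-point feedback model mentioned above, the following is a minimal single-machine sketch of the classical two-point zeroth-order gradient estimator; it is not the paper's distributed algorithm, and the quadratic costs, step size `eta`, and perturbation radius `delta` are illustrative assumptions only.

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta, rng):
    """Classical two-point zeroth-order gradient estimate.

    Queries f at x + delta*u and x - delta*u for a random unit direction u
    and returns (d / (2*delta)) * (f(x + delta*u) - f(x - delta*u)) * u.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Illustrative bandit loop with hypothetical quadratic costs drawn each round.
rng = np.random.default_rng(0)
d, T, eta, delta = 10, 1000, 0.05, 0.01
x = np.zeros(d)
for t in range(T):
    target = rng.standard_normal(d)              # cost revealed only via queries
    f_t = lambda z, c=target: np.sum((z - c) ** 2)
    g = two_point_gradient_estimate(f_t, x, delta, rng)
    x = x - eta * g                              # projection step omitted for brevity
```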