Motivated by applications to online advertising and recommender systems, we consider a game-theoretic model with delayed rewards and asynchronous, payoff-based feedback. In contrast to previous work on delayed multi-armed bandits, we focus on games with continuous action spaces, and we examine the long-run behavior of strategic agents that follow a no-regret learning policy (but are otherwise oblivious to the game being played, the objectives of their opponents, etc.). To account for the lack of a consistent stream of information (for instance, rewards can arrive out of order and with an a priori unbounded delay), we introduce a gradient-free learning policy where payoff information is placed in a priority queue as it arrives. Somewhat surprisingly, we find that under a standard diagonal concavity assumption, the induced sequence of play converges to a Nash equilibrium with probability 1, even if the delay between choosing an action and receiving the corresponding reward is unbounded.
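The priority-queue idea in the abstract can be illustrated with a rough single-player sketch. This is not the authors' algorithm: the oldest-first dequeue rule, the one-point (payoff-only) gradient estimate, the step-size and exploration schedules, the toy payoff, and the names `DelayedFeedbackQueue`, `payoff`, and `simulate` are all assumptions made here for illustration.

```python
import heapq
import random


class DelayedFeedbackQueue:
    """Min-heap of payoff reports keyed by the round the action was played."""

    def __init__(self):
        self._heap = []

    def push(self, played_round, reward, sign, radius):
        # Ordering by played_round means the oldest outstanding report
        # comes out first, whatever order the reports arrived in.
        heapq.heappush(self._heap, (played_round, reward, sign, radius))

    def pop_oldest(self):
        """Return the report with the smallest round index, or None if empty."""
        return heapq.heappop(self._heap) if self._heap else None


def payoff(action):
    # Hypothetical concave payoff on [0, 1] with its maximum at 0.3.
    return -(action - 0.3) ** 2


def simulate(num_rounds=5000, max_delay=25, seed=0):
    rng = random.Random(seed)
    queue = DelayedFeedbackQueue()
    in_flight = []  # (arrival_round, played_round, reward, sign, radius)
    x = 0.9         # base point of the learner
    for t in range(1, num_rounds + 1):
        radius = 0.25 / t ** 0.25  # assumed exploration schedule
        step = 0.5 / t ** 0.75     # assumed step-size schedule
        # Keep the base point far enough inside [0, 1] to perturb safely.
        x = min(1.0 - radius, max(radius, x))
        sign = rng.choice((-1.0, 1.0))
        reward = payoff(x + radius * sign)
        # The reward only becomes visible after a random delay.
        arrival = t + rng.randint(0, max_delay)
        in_flight.append((arrival, t, reward, sign, radius))
        # Move every report that has arrived by now into the priority queue.
        for item in [f for f in in_flight if f[0] <= t]:
            queue.push(*item[1:])
        in_flight = [f for f in in_flight if f[0] > t]
        # Use at most one report per round: the oldest one available.
        report = queue.pop_oldest()
        if report is not None:
            _, old_reward, old_sign, old_radius = report
            # One-point estimate of the payoff gradient (no gradient access).
            gradient_estimate = old_reward * old_sign / old_radius
            x += step * gradient_estimate
    return min(1.0, max(0.0, x))
```

Calling `simulate()` returns the learner's final base point in [0, 1]. The heap is keyed by the round in which the action was played, so reports that arrive out of order are still consumed oldest-first; other dequeue rules are possible and may behave differently.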
Author Information
Amélie Héliou (Criteo)
Panayotis Mertikopoulos (CNRS and Criteo AI Lab)
Zhengyuan Zhou (Stanford University)
More from the Same Authors
- 2023 Poster: Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism
  Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos
- 2022 Poster: Nested Bandits
  Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier · Houssam Zenati
- 2022 Poster: UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees
  Kimon Antonakopoulos · Dong Quan Vu · Volkan Cevher · Kfir Levy · Panayotis Mertikopoulos
- 2022 Oral: UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees
  Kimon Antonakopoulos · Dong Quan Vu · Volkan Cevher · Kfir Levy · Panayotis Mertikopoulos
- 2022 Spotlight: Nested Bandits
  Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier · Houssam Zenati
- 2022 Poster: AdaGrad Avoids Saddle Points
  Kimon Antonakopoulos · Panayotis Mertikopoulos · Georgios Piliouras · Xiao Wang
- 2022 Spotlight: AdaGrad Avoids Saddle Points
  Kimon Antonakopoulos · Panayotis Mertikopoulos · Georgios Piliouras · Xiao Wang
- 2021 Poster: The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets
  Ya-Ping Hsieh · Panayotis Mertikopoulos · Volkan Cevher
- 2021 Poster: Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach
  Nadav Hallak · Panayotis Mertikopoulos · Volkan Cevher
- 2021 Spotlight: Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach
  Nadav Hallak · Panayotis Mertikopoulos · Volkan Cevher
- 2021 Oral: The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets
  Ya-Ping Hsieh · Panayotis Mertikopoulos · Volkan Cevher
- 2021 Poster: Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging
  Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier
- 2021 Spotlight: Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging
  Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier
- 2020 Poster: A new regret analysis for Adam-type algorithms
  Ahmet Alacaoglu · Yura Malitsky · Panayotis Mertikopoulos · Volkan Cevher
- 2020 Poster: Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
  Tianyi Lin · Zhengyuan Zhou · Panayotis Mertikopoulos · Michael Jordan
- 2020 Poster: Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits
  Nian Si · Fan Zhang · Zhengyuan Zhou · Jose Blanchet
- 2019 Poster: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints
  Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos
- 2019 Oral: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints
  Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos
- 2018 Poster: MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels
  Lu Jiang · Zhengyuan Zhou · Thomas Leung · Li-Jia Li · Li Fei-Fei
- 2018 Poster: Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
  Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter Glynn · Yinyu Ye · Li-Jia Li · Li Fei-Fei
- 2018 Oral: MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels
  Lu Jiang · Zhengyuan Zhou · Thomas Leung · Li-Jia Li · Li Fei-Fei
- 2018 Oral: Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
  Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter Glynn · Yinyu Ye · Li-Jia Li · Li Fei-Fei