Timezone: »
Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.
Author Information
Gabriele Farina (Carnegie Mellon University)
I am currently a first-year Ph.D. student in the Computer Science Department at Carnegie Mellon University, where I am fortunate to be advised by Tuomas Sandholm. I am part of the Electronics Marketplaces Lab. I mostly work on Kidney Exchange and Algorithmic Game Theory.
Christian Kroer (Columbia University)
Tuomas Sandholm (Carnegie Mellon University)
Tuomas Sandholm is Angel Jordan Professor of Computer Science at Carnegie Mellon University. He is Founder and Director of the Electronic Marketplaces Laboratory. He has published over 450 papers. With his student Vince Conitzer, he initiated the study of automated mechanism design in 2001. In parallel with his academic career, he was Founder, Chairman, and CTO/Chief Scientist of CombineNet, Inc. from 1997 until its acquisition in 2010. During this period the company commercialized over 800 of the world's largest-scale generalized combinatorial multi-attribute auctions, with over $60 billion in total spend and over $6 billion in generated savings. He is Founder and CEO of Optimized Markets, Strategic Machine, and Strategy Robot. Also, his algorithms run the UNOS kidney exchange, which includes 69% of the transplant centers in the US. He has developed the leading algorithms for several general classes of game. The team that he leads is the two-time world champion in computer Heads-Up No-Limit Texas Hold’em poker, and Libratus became the first and only AI to beat top humans at that game. Among his many honors are the NSF Career Award, inaugural ACM Autonomous Agents Research Award, Sloan Fellowship, Carnegie Science Center Award for Excellence, Edelman Laureateship, Newell Award for Research Excellence, and Computers and Thought Award. He is Fellow of the ACM, AAAI, and INFORMS. He holds an honorary doctorate from the University of Zurich.
More from the Same Authors
-
2020 Poster: Refined bounds for algorithm configuration: The knife-edge of dual class approximability »
Nina Balcan · Tuomas Sandholm · Ellen Vitercik -
2020 Poster: Sparsified Linear Programming for Zero-Sum Equilibrium Finding »
Brian Zhang · Tuomas Sandholm -
2019 Poster: Deep Counterfactual Regret Minimization »
Noam Brown · Adam Lerer · Sam Gross · Tuomas Sandholm -
2019 Poster: Stable-Predictive Optimistic Counterfactual Regret Minimization »
Gabriele Farina · Christian Kroer · Noam Brown · Tuomas Sandholm -
2019 Poster: Regret Circuits: Composability of Regret Minimizers »
Gabriele Farina · Christian Kroer · Tuomas Sandholm -
2019 Oral: Deep Counterfactual Regret Minimization »
Noam Brown · Adam Lerer · Sam Gross · Tuomas Sandholm -
2019 Oral: Stable-Predictive Optimistic Counterfactual Regret Minimization »
Gabriele Farina · Christian Kroer · Noam Brown · Tuomas Sandholm -
2019 Oral: Regret Circuits: Composability of Regret Minimizers »
Gabriele Farina · Christian Kroer · Tuomas Sandholm -
2018 Poster: Learning to Branch »
Nina Balcan · Travis Dick · Tuomas Sandholm · Ellen Vitercik -
2018 Oral: Learning to Branch »
Nina Balcan · Travis Dick · Tuomas Sandholm · Ellen Vitercik -
2018 Tutorial: Machine Learning in Automated Mechanism Design for Pricing and Auctions »
Nina Balcan · Tuomas Sandholm · Ellen Vitercik -
2017 Poster: Regret Minimization in Behaviorally-Constrained Zero-Sum Games »
Gabriele Farina · Christian Kroer · Tuomas Sandholm -
2017 Poster: Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning »
Noam Brown · Tuomas Sandholm -
2017 Talk: Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning »
Noam Brown · Tuomas Sandholm -
2017 Talk: Regret Minimization in Behaviorally-Constrained Zero-Sum Games »
Gabriele Farina · Christian Kroer · Tuomas Sandholm