Poster
Competing Bandits in Matching Markets via Super Stability
Soumya Basu
East Exhibition Hall A-B #E-1807
We study bandit learning in matching markets with two-sided reward uncertainty, extending prior research primarily focused on single-sided uncertainty. Leveraging the concept of `super-stability' from Irving (1994), we demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings under incomplete information. By employing the Extended GS algorithm, our centralized algorithm attains a logarithmic pessimal stable regret dependent on an instance-dependent admissible gap parameter. This algorithm is further adapted to a decentralized setting with a constant regret increase. Finally, we establish a novel centralized instance-dependent lower bound for binary stable regret, elucidating the roles of the admissible gap and super-stable matching in characterizing the complexity of stable matching with bandit feedback.
Imagine trying to set up fair and lasting partnerships, like matching people for jobs or dates online, where neither side initially knows exactly what they want or how good a match will truly be. This research presents a smarter way to find these "stable" partnerships, especially in complex scenarios where both sides are learning as they go. By using a more advanced matching algorithm this paper shows how systems can quickly learn to make excellent matches with very few mistakes over time, whether they operate through a central system or in a decentralized way. This work also sets a new limits for understanding how difficult it is to achieve truly stable partnerships when information is limited.
Live content is unavailable. Log in and register to view live content