

Poster

Adaptive Sample Sharing for Multi Agent Linear Bandits

Hamza Cherkaoui · Merwan Barlier · Igor Colin

West Exhibition Hall B2-B3 #W-918
Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

The multi-agent linear bandit is a well-known setting in which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumption about the structure of the bandit parameters. Our main result formalizes the trade-off between the bias and the uncertainty of the bandit parameter estimate, which governs when collaboration is efficient. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state of the art is validated through theoretical analysis and through empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when the agents' parameters display a cluster structure, our algorithm accurately recovers the clusters.
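
The bias/uncertainty trade-off mentioned in the abstract can be illustrated with a small sketch. This is not the BASS algorithm, whose sharing rule is not given on this page; it only assumes a 2-D linear reward model with ridge regression, and all names (`simulate`, `ridge_fit`, `width`, `error`) and parameter values are hypothetical. Pooling another agent's samples always shrinks the confidence width of the estimate, but it introduces a bias that grows with the distance between the two agents' parameters.

```python
import math
import random


def simulate(theta, n, rng, noise=0.1):
    """Draw n (context, reward) pairs from a 2-D linear model (assumed setup)."""
    samples = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        r = theta[0] * x[0] + theta[1] * x[1] + rng.gauss(0, noise)
        samples.append((x, r))
    return samples


def ridge_fit(samples, lam=1.0):
    """Return (theta_hat, V_inv) with V = lam * I + sum of x x^T."""
    a, b, d = lam, 0.0, lam      # V = [[a, b], [b, d]]
    g0, g1 = 0.0, 0.0            # g = sum of reward * x
    for (x0, x1), r in samples:
        a += x0 * x0
        b += x0 * x1
        d += x1 * x1
        g0 += x0 * r
        g1 += x1 * r
    det = a * d - b * b          # positive because V is positive definite
    v_inv = ((d / det, -b / det), (-b / det, a / det))
    theta_hat = (v_inv[0][0] * g0 + v_inv[0][1] * g1,
                 v_inv[1][0] * g0 + v_inv[1][1] * g1)
    return theta_hat, v_inv


def width(v_inv, x):
    """Uncertainty proxy sqrt(x^T V^{-1} x) in direction x."""
    q = (x[0] * (v_inv[0][0] * x[0] + v_inv[0][1] * x[1])
         + x[1] * (v_inv[1][0] * x[0] + v_inv[1][1] * x[1]))
    return math.sqrt(q)


def error(theta_hat, theta):
    """Euclidean distance between the estimate and the true parameter."""
    return math.hypot(theta_hat[0] - theta[0], theta_hat[1] - theta[1])


if __name__ == "__main__":
    rng = random.Random(0)
    theta_a = (1.0, 0.0)                          # agent of interest
    local = simulate(theta_a, 20, rng)            # its own few samples
    near = simulate((0.98, 0.02), 200, rng)       # agent with a close parameter
    far = simulate((-1.0, 0.5), 200, rng)         # agent with a distant parameter
    x = (1.0, 0.0)
    for name, data in [("local only", local),
                       ("local + near", local + near),
                       ("local + far", local + far)]:
        theta_hat, v_inv = ridge_fit(data)
        print(name, "width:", round(width(v_inv, x), 3),
              "error:", round(error(theta_hat, theta_a), 3))
```

In this toy setup, both pooled estimates have a smaller width than the local one, but pooling with the distant agent pulls the estimate far from the true parameter. An adaptive sharing rule has to detect that difference from data alone.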

Lay Summary:

Recommendation algorithms, like those used on video platforms, often serve users with similar tastes. While a recommendation system could benefit from sharing what it has learned about one user, doing so effectively requires identifying when user preferences overlap. This motivated us to explore how such systems can collaborate to accelerate learning.

We developed BASS, a method that enables algorithms to decide when and with whom to share information. It uses observed behavior to detect when recommendation systems are learning from similar user groups and shares information only when it improves performance. Notably, BASS requires no prior knowledge about which systems are related.

This approach makes collaboration between learning systems more efficient and impactful. Whether applied to apps, devices, or content platforms, BASS helps them learn faster by leveraging shared patterns across users. Experiments on both synthetic and real-world data show that BASS consistently outperforms existing methods.
