Fairness of Exposure in Stochastic Bandits

Abstract
Contextual bandit algorithms have become widely used for recommendation in online systems (e.g. marketplaces, music streaming, news), where they now wield substantial influence on which items get shown to users. This raises questions of fairness to the items --- and to the sellers, artists, and writers who benefit from this exposure. We argue that the conventional bandit formulation can lead to an undesirable and unfair winner-takes-all allocation of exposure. To remedy this problem, we propose a new bandit objective that guarantees merit-based fairness of exposure to the items while optimizing utility to the users. We formulate fairness regret and reward regret in this setting and present algorithms for both stochastic multi-armed bandits and stochastic linear bandits. We prove that the algorithms achieve sublinear fairness regret and reward regret. Beyond the theoretical analysis, we also provide empirical evidence that these algorithms can allocate exposure to different arms effectively.
Author Information
Luke Lequn Wang (Cornell University)
Yiwei Bai (Cornell University)
Wen Sun (Cornell University)
Thorsten Joachims (Cornell)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Fairness of Exposure in Stochastic Bandits (Thu, Jul 22, 04:00 -- 06:00 PM)
More from the Same Authors
- 2021: Corruption Robust Offline Reinforcement Learning (Xuezhou Zhang · Yiding Chen · Jerry Zhu · Wen Sun)
- 2021: Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage (Jonathan Chang · Masatoshi Uehara · Dhruv Sreenivas · Rahul Kidambi · Wen Sun)
- 2021: MobILE: Model-Based Imitation Learning From Observation Alone (Rahul Kidambi · Jonathan Chang · Wen Sun)
- 2022: Learning from Preference Feedback in Combinatorial Action Spaces (Thorsten Joachims)
- 2022 Poster: Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach (Xuezhou Zhang · Yuda Song · Masatoshi Uehara · Mengdi Wang · Alekh Agarwal · Wen Sun)
- 2022 Poster: Off-Policy Evaluation for Large Action Spaces via Embeddings (Yuta Saito · Thorsten Joachims)
- 2022 Poster: Learning Bellman Complete Representations for Offline Policy Evaluation (Jonathan Chang · Kaiwen Wang · Nathan Kallus · Wen Sun)
- 2022 Poster: Improving Screening Processes via Calibrated Subset Selection (Luke Lequn Wang · Thorsten Joachims · Manuel Gomez-Rodriguez)
- 2022 Spotlight: Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach (Xuezhou Zhang · Yuda Song · Masatoshi Uehara · Mengdi Wang · Alekh Agarwal · Wen Sun)
- 2022 Spotlight: Off-Policy Evaluation for Large Action Spaces via Embeddings (Yuta Saito · Thorsten Joachims)
- 2022 Spotlight: Improving Screening Processes via Calibrated Subset Selection (Luke Lequn Wang · Thorsten Joachims · Manuel Gomez-Rodriguez)
- 2022 Oral: Learning Bellman Complete Representations for Offline Policy Evaluation (Jonathan Chang · Kaiwen Wang · Nathan Kallus · Wen Sun)
- 2021 Poster: Robust Policy Gradient against Strong Data Corruption (Xuezhou Zhang · Yiding Chen · Jerry Zhu · Wen Sun)
- 2021 Spotlight: Robust Policy Gradient against Strong Data Corruption (Xuezhou Zhang · Yiding Chen · Jerry Zhu · Wen Sun)
- 2021 Poster: Bilinear Classes: A Structural Framework for Provable Generalization in RL (Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang)
- 2021 Oral: Bilinear Classes: A Structural Framework for Provable Generalization in RL (Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang)
- 2021 Poster: PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration (Yuda Song · Wen Sun)
- 2021 Spotlight: PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration (Yuda Song · Wen Sun)
- 2019 Poster: CAB: Continuous Adaptive Blending for Policy Evaluation and Learning (Yi Su · Luke Lequn Wang · Michele Santacatterina · Thorsten Joachims)
- 2019 Oral: CAB: Continuous Adaptive Blending for Policy Evaluation and Learning (Yi Su · Luke Lequn Wang · Michele Santacatterina · Thorsten Joachims)