Poster
Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging
Ping-Chun Hsieh · Xi Liu · Anirban Bhattacharya · P R Kumar
Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection. In these applications, a "reneging" phenomenon, where participants may disengage from future interactions after observing an unsatisfiable outcome, is rather prevalent. To address the above issue, this paper proposes a model of heteroscedastic linear bandits with reneging, which allows each participant to have a distinct "satisfaction level," with any interaction outcome falling short of that level resulting in that participant reneging. Moreover, it allows the variance of the outcome to be context-dependent. Based on this model, we develop a UCB-type policy, namely HR-UCB, and prove that it achieves $\mathcal{O}\big(\sqrt{{T}(\log({T}))^{3}}\big)$ regret. Finally, we validate the performance of HR-UCB via simulations.
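The reneging model described in the abstract can be illustrated with a small simulation. The following is only a rough sketch under stated assumptions, not the paper's HR-UCB policy or its exact model: the Gaussian outcome, the log-linear variance, the fixed context, and every name (`simulate_lifetime`, `theta`, `phi`, `satisfaction_level`, `max_rounds`) are hypothetical choices made for illustration.

```python
import math
import random


def simulate_lifetime(theta, phi, context, satisfaction_level, max_rounds, rng):
    """Count interactions until a participant reneges (illustrative only).

    Assumptions not taken from the source: outcomes are Gaussian, the mean is
    linear in the context, and the variance is exp(phi . context).
    """
    mean = sum(t * x for t, x in zip(theta, context))
    # Context-dependent (heteroscedastic) variance; the log-linear form is an assumption.
    variance = math.exp(sum(p * x for p, x in zip(phi, context)))
    std = math.sqrt(variance)

    for round_index in range(max_rounds):
        outcome = rng.gauss(mean, std)
        # The participant reneges as soon as an outcome falls short of
        # their satisfaction level; that interaction is still counted.
        if outcome < satisfaction_level:
            return round_index + 1
    return max_rounds


rng = random.Random(0)
lifetime = simulate_lifetime(
    theta=[1.0, 0.5],
    phi=[0.2, -0.1],
    context=[1.0, 2.0],
    satisfaction_level=0.5,
    max_rounds=1000,
    rng=rng,
)
print("simulated lifetime:", lifetime)
```

In this sketch, a context with a higher mean outcome but also a higher variance can still produce a shorter lifetime, because the chance of falling below the satisfaction level depends on both quantities. This is why the abstract stresses that the variance is context-dependent.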
Author Information
Ping-Chun Hsieh (Texas A&M University)
Xi Liu (Texas A&M University)
Anirban Bhattacharya (Texas A&M University)
P R Kumar (Texas A&M University)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging
Wed Jun 12th 06:30 -- 06:35 PM Room Seaside Ballroom
More from the Same Authors
- 2020 Poster: Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits
Xi Liu · Ping-Chun Hsieh · Yu Heng Hung · Anirban Bhattacharya · P. Kumar