Under-exploring in Bandits with Confounded Data
Nihal Sharma · Soumya Basu · Karthikeyan Shanmugam · Sanjay Shakkottai

We study the problem of Multi-Armed Bandits with mean bounds, where each arm is associated with an interval in which its mean reward lies. We develop the GLobal Under-Explore (GLUE) algorithm which, for each arm, uses these intervals to infer "pseudo-variances" that determine the rate of exploration. We provide regret guarantees for GLUE and show that it is never worse than the standard Upper Confidence Bound algorithm. Further, we show regimes in which GLUE improves upon existing regret guarantees for structured bandit problems. Finally, we present the practical setting of learning adaptive interventions using prior confounded data, in which unrecorded variables affect rewards. We show that mean bounds for each intervention can be extracted from such logs and then used to improve the learning process. We also provide semi-synthetic experiments on real-world data sets to validate our findings.
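To make the setting concrete, the following is a minimal sketch of how per-arm mean bounds can reduce exploration in a UCB-style learner. It is not the GLUE algorithm: the pseudo-variance construction is not given in the abstract and is not reproduced here. The sketch only assumes rewards in [0, 1], drops arms whose upper bound lies below the largest lower bound, and clips the standard UCB1 index at each arm's upper bound. The names `clipped_ucb` and `bernoulli_pull` are hypothetical and chosen for illustration.

```python
import math
import random


def bernoulli_pull(means):
    """Return a reward sampler for Bernoulli arms with the given true means."""
    def pull(arm, rng):
        return 1.0 if rng.random() < means[arm] else 0.0
    return pull


def clipped_ucb(lower, upper, pull, horizon, seed=0):
    """UCB1 with indices clipped to known mean bounds (illustrative, not GLUE).

    lower[a], upper[a] bound the mean reward of arm a; rewards lie in [0, 1].
    Returns the number of times each arm was played.
    """
    rng = random.Random(seed)
    k = len(lower)
    best_lower = max(lower)
    # An arm whose upper bound is below the best lower bound cannot be optimal.
    active = [a for a in range(k) if upper[a] >= best_lower]
    counts = [0] * k
    sums = [0.0] * k

    def index(a, t):
        mean = sums[a] / counts[a]
        bonus = math.sqrt(2.0 * math.log(t) / counts[a])
        # The true mean cannot exceed upper[a], so neither should the index.
        return min(upper[a], mean + bonus)

    for t in range(1, horizon + 1):
        untried = [a for a in active if counts[a] == 0]
        if untried:
            arm = untried[0]  # play every surviving arm once first
        else:
            arm = max(active, key=lambda a: index(a, t))
        reward = pull(arm, rng)
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

For example, `clipped_ucb([0.1, 0.6, 0.0], [0.5, 0.9, 1.0], bernoulli_pull([0.3, 0.8, 0.4]), 2000)` never plays the first arm, because its upper bound of 0.5 is below the second arm's lower bound of 0.6.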

Author Information

Nihal Sharma (The University of Texas at Austin)
Soumya Basu (Google)
Karthikeyan Shanmugam (IBM Research NY)

I have been a Research Staff Member with the IBM Research AI group, NY, since 2017. Previously, I was a Herman Goldstine Postdoctoral Fellow in the Math Sciences Division at IBM Research, NY. I obtained my Ph.D. in Electrical and Computer Engineering from UT Austin in summer 2016, where my advisor was Alex Dimakis. I obtained my MS degree in Electrical Engineering from the University of Southern California (2010-2012), and my B.Tech and M.Tech degrees in Electrical Engineering from IIT Madras in 2010. My research interests lie broadly in graph algorithms, machine learning, optimization, coding theory and information theory. In machine learning, my recent focus is on graphical model learning, causal inference and explainability. I also work on problems relating to information flow, storage and caching over networks.

Sanjay Shakkottai (University of Texas at Austin)

More from the Same Authors