Stochastic Linear Bandits with Unknown Safety Constraints and Local Feedback
Nithin Varma · Sahin Lale · Anima Anandkumar
Event URL: https://openreview.net/forum?id=xFXaZXLhpK
In many real-world decision-making tasks, e.g., clinical trials, agents must satisfy a diverse set of unknown safety constraints at all times while receiving feedback only on the safety constraints relevant to the chosen action, e.g., the ones close to violation. In this work, we study stochastic linear bandits with such unknown safety constraints and local safety feedback. The agent's goal is to maximize the cumulative reward while satisfying multiple unknown affine or nonlinear safety constraints. At each time step, the agent receives noisy feedback on a particular safety constraint only if the chosen action belongs to the associated constraint set, i.e., local safety feedback. For this setting, we design upper confidence bound (UCB) and Thompson Sampling-based algorithms. In designing these algorithms, we carefully prescribe an additional exploration incentive that guarantees the selection of high-reward actions that are also safe and ensures sufficient exploration in the relevant constraint sets to recover the optimal safe action. We show that for $M$ distinct constraints, both algorithms attain $\tilde{\mathcal{O}}(\sqrt{MT})$ regret after $T$ time steps without any safety violations. We empirically study the performance of the proposed algorithms under various safety constraints and on a real-world credit dataset. We show that both algorithms explore safely and quickly recover the optimal safe actions.
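
The local safety feedback model described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes linear safety functions (a special case of affine) and models each constraint set as a Euclidean ball; the function name `local_safety_feedback` and the dictionary keys `theta`, `center`, and `radius` are hypothetical choices made only for this illustration.

```python
import random


def local_safety_feedback(action, constraints, noise_std=0.1, rng=None):
    """Return noisy safety observations only for the constraints whose
    associated set contains the chosen action (local safety feedback).

    Each constraint is a dict with hypothetical keys:
      'theta'            : parameter vector of a linear safety function,
                           unknown to the agent in the bandit setting
      'center', 'radius' : a Euclidean ball standing in for the constraint set
    """
    rng = rng or random.Random()
    feedback = {}
    for i, c in enumerate(constraints):
        # Membership test: is the action inside this constraint's set?
        dist = sum((a - m) ** 2 for a, m in zip(action, c["center"])) ** 0.5
        if dist <= c["radius"]:
            # Noisy evaluation of the safety function at the chosen action.
            value = sum(a * t for a, t in zip(action, c["theta"]))
            feedback[i] = value + rng.gauss(0.0, noise_std)
    return feedback


# Two illustrative constraints (M = 2) in a 2-dimensional action space.
constraints = [
    {"theta": [1.0, 0.0], "center": [1.0, 0.0], "radius": 0.5},
    {"theta": [0.0, 1.0], "center": [0.0, 1.0], "radius": 0.5},
]

# The action lies only in the first constraint set, so only that
# constraint produces an observation.
obs = local_safety_feedback([0.9, 0.1], constraints, noise_std=0.0,
                            rng=random.Random(0))
```

An action that lies outside every constraint set yields an empty dictionary, which is why the abstract stresses an additional exploration incentive: without it, the agent may never gather observations for the constraints that determine the optimal safe action.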

Author Information

Nithin Varma (IIT Madras)
Sahin Lale (California Institute of Technology)
Anima Anandkumar (Caltech and NVIDIA)

Anima Anandkumar is a Bren Professor at Caltech and Director of ML Research at NVIDIA. She was previously a Principal Scientist at Amazon Web Services. She is passionate about designing principled AI algorithms and applying them to interdisciplinary domains. She has received several honors, including an IEEE Fellowship, an Alfred P. Sloan Fellowship, an NSF CAREER Award, Young Investigator Awards from the DoD, VentureBeat’s “Women in AI” award, the NYTimes GoodTech award, and Faculty Fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. She has appeared in the PBS Frontline documentary on the “Amazon empire” and has given keynotes at many forums, such as TEDx, KDD, ICLR, and ACM. Anima received her BTech from the Indian Institute of Technology Madras and her PhD from Cornell University, did her postdoctoral research at MIT, and held an assistant professorship at the University of California, Irvine.
