Poster in Workshop: Reinforcement Learning for Real Life
Optimizing Dynamic Treatment Regimes via Volatile Contextual Gaussian Process Bandits
Ahmet Alparslan Celik · Cem Tekin
Management of chronic diseases such as diabetes mellitus requires adapting treatment regimes based on patient characteristics and response. No single treatment fits all patients in all contexts; moreover, the set of admissible treatments usually varies over the course of the disease. In this paper, we address the problem of optimizing treatment regimes under time-varying constraints by using volatile contextual Gaussian process bandits. In particular, we propose a variant of GP-UCB with volatile arms, which takes the patient's context into account together with the set of admissible treatments when recommending new treatments. Our Bayesian approach provides treatment recommendations along with confidence bounds that can be used for risk assessment. We use our algorithm to recommend bolus insulin doses for type 1 diabetes mellitus patients. Simulation studies show that our algorithm compares favorably with traditional blood glucose regulation methods.
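To make the recommendation loop concrete, the following is a minimal sketch of contextual GP-UCB with a time-varying (volatile) set of admissible arms, in the spirit of the abstract. The scikit-learn GP surrogate, RBF kernel, synthetic reward, and logarithmic exploration schedule are assumptions for illustration only, not the authors' exact algorithm or simulator.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Illustrative contextual GP-UCB with a volatile admissible-arm set.
# Everything below (reward model, kernel, beta schedule) is a placeholder.

rng = np.random.default_rng(0)


def toy_reward(context, dose):
    # Hypothetical reward: peaks when the dose matches a context-dependent
    # target, standing in for a glycemic-outcome score.
    target = 2.0 + 3.0 * context
    return -((dose - target) ** 2) + 0.1 * rng.standard_normal()


kernel = RBF(length_scale=1.0)
X_hist, y_hist = [], []  # past (context, dose) features and observed rewards

for t in range(1, 51):
    context = rng.uniform(0.0, 1.0)                    # e.g. normalized patient state
    # Volatile arms: the set of admissible doses changes every round.
    admissible = rng.uniform(0.0, 6.0, size=int(rng.integers(3, 8)))
    candidates = np.column_stack([np.full_like(admissible, context), admissible])

    if X_hist:
        gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1, normalize_y=True)
        gp.fit(np.array(X_hist), np.array(y_hist))
        mu, sigma = gp.predict(candidates, return_std=True)
    else:
        mu, sigma = np.zeros(len(admissible)), np.ones(len(admissible))

    beta = 2.0 * np.log(t + 1)                          # assumed exploration schedule
    ucb = mu + np.sqrt(beta) * sigma                    # upper confidence bound per arm
    choice = int(np.argmax(ucb))                        # recommend the best admissible dose

    reward = toy_reward(context, admissible[choice])
    X_hist.append(candidates[choice])
    y_hist.append(reward)
```

The posterior standard deviation `sigma` at the recommended arm is what would serve as the confidence bound for risk assessment mentioned in the abstract; here it simply drives exploration through the UCB score.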