Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare
Shengpu Tang · Maggie Makar · Michael Sjoding · Finale Doshi-Velez · Jenna Wiens
Event URL: https://openreview.net/forum?id=wl_o_hilncS

Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to make meaningful inferences about rarely observed sub-action combinations; this is particularly problematic for offline settings, where data may be limited. In this work, we propose a form of linear Q-function decomposition induced by factored action spaces. We study the theoretical properties of our approach, identifying scenarios where it is guaranteed to lead to zero bias when used to approximate the Q-function. Outside the regimes with theoretical guarantees, we show that our approach can still be useful because it leads to better sample efficiency without necessarily sacrificing policy optimality, allowing us to achieve a better bias-variance trade-off. Across several offline RL problems using simulators and real-world datasets motivated by healthcare problems, we demonstrate that incorporating factored action spaces into value-based RL can result in better-performing policies. Our approach can help an agent make more accurate inferences within under-explored regions of the state-action space when applying RL to observational datasets.
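
To make the idea of a linear Q-function decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes that, for a fixed state, the joint Q-value is approximated as a sum of one term per sub-action; the function names `factored_q` and `greedy_action` and the toy numbers are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def factored_q(sub_q):
    """Combine per-sub-action values into joint Q-values by summation.

    sub_q: list of 1-D arrays for a single fixed state, where sub_q[d][a_d]
    is the (assumed) contribution of choosing sub-action a_d in dimension d.
    Returns a dict mapping each joint action tuple to its summed value.
    """
    ranges = [range(len(q)) for q in sub_q]
    return {
        joint: float(sum(q[a_d] for q, a_d in zip(sub_q, joint)))
        for joint in itertools.product(*ranges)
    }


def greedy_action(sub_q):
    """Pick the greedy joint action one dimension at a time.

    Because the approximation is a sum of per-dimension terms, maximizing
    over joint actions reduces to an independent argmax per dimension.
    """
    return tuple(int(np.argmax(q)) for q in sub_q)


# Toy example (made-up numbers): three binary sub-actions, 2**3 = 8 joint actions.
sub_q = [np.array([0.1, 0.5]), np.array([0.3, 0.2]), np.array([0.0, 1.0])]
joint_q = factored_q(sub_q)
best = greedy_action(sub_q)
```

In this sketch, only 2 + 2 + 2 = 6 values are stored for the state instead of 8 joint values, and the gap widens as the number of sub-actions grows; this parameter sharing is one way to read the sample-efficiency argument in the abstract.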

Author Information

Shengpu Tang (University of Michigan)

Shengpu Tang is a PhD candidate in the computer science department at the University of Michigan. He is a member of the Machine Learning for Data-Driven Decisions (MLD3) research group led by Jenna Wiens. His current research focuses on developing computational methods that help solve important problems in healthcare, such as risk stratification and dynamic treatment recommendations. More generally, he is interested in broader applications of AI/ML, reinforcement learning and graph mining, computer game design, self-driving cars, security and hacking, as well as teaching. For more details, see his website at https://shengpu-tang.me/

Maggie Makar (University of Michigan)
Michael Sjoding
Finale Doshi-Velez (Harvard University)
Jenna Wiens (University of Michigan)

More from the Same Authors