Interpretable learning-to-defer for sequential decision-making
Shalmali Joshi · Sonali Parbhoo · Finale Doshi-Velez

We focus on the problem of learning to defer to an expert under non-stationary dynamics in a sequential decision-making setting, by identifying pre-emptive deferral strategies. Pre-emptive deferral strategies are desirable when delaying deferral can result in suboptimal or undesirable long-term outcomes, e.g., unexpected potential side-effects of a treatment. We formalize a deferral policy as pre-emptive if delaying deferral does not lead to improved long-term outcomes. Our method, Sequential Learning-to-Defer (SLTD), explicitly measures the (expected) value of deferring now versus later, based on the underlying uncertainty in the non-stationary dynamics captured via posterior sampling. We demonstrate that capturing this uncertainty allows us to test whether delaying deferral can improve mean outcomes, and also provides domain experts with an indication of when the model's performance is reliable. Finally, we show that our approach outperforms existing non-sequential learning-to-defer baselines, while reducing overall uncertainty, on multiple synthetic and semi-synthetic (Sepsis-Diabetes) simulators.
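The core decision rule described in the abstract can be illustrated with a toy sketch (this is not the paper's SLTD implementation; the 1-D dynamics, reward, and posterior below are invented assumptions purely for illustration). We draw posterior samples of an uncertain dynamics parameter, estimate the expected long-term value of deferring to the expert now versus one step later, and defer pre-emptively if waiting does not improve the mean outcome:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior over an unknown drift parameter of a 1-D system
# (stands in for uncertainty about non-stationary dynamics).
def sample_dynamics(n_samples):
    return rng.normal(loc=-0.5, scale=0.3, size=n_samples)

def rollout_value(drift, state, defer_at, horizon=5):
    """Cumulative reward if the expert takes over at step `defer_at`.

    Before deferral the model acts under the uncertain dynamics; after
    deferral the expert stabilises the state (toy assumption). Reward at
    each step is -|state|, so states far from 0 are costly.
    """
    total = 0.0
    for t in range(horizon):
        if t >= defer_at:
            state = 0.0            # expert has taken over
        else:
            state = state + drift  # model acts; dynamics are uncertain
        total += -abs(state)
    return total

def should_defer_now(state, n_samples=1000):
    """Defer pre-emptively iff delaying deferral by one step does not
    improve the posterior-expected long-term outcome."""
    drifts = sample_dynamics(n_samples)
    v_now = np.mean([rollout_value(d, state, defer_at=0) for d in drifts])
    v_later = np.mean([rollout_value(d, state, defer_at=1) for d in drifts])
    return v_now >= v_later

print(should_defer_now(state=2.0))
```

The spread of the posterior samples also gives a natural reliability signal: when sampled rollout values disagree strongly, the model's value estimates are less trustworthy, which is the kind of indication to domain experts the abstract mentions.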

Author Information

Shalmali Joshi (Harvard University (SEAS))
Sonali Parbhoo (Harvard University)
Finale Doshi-Velez (Harvard University)

Finale Doshi-Velez is a Gordon McKay Professor in Computer Science at the Harvard Paulson School of Engineering and Applied Sciences. She completed her MSc from the University of Cambridge as a Marshall Scholar, her PhD from MIT, and her postdoc at Harvard Medical School. Her interests lie at the intersection of machine learning, healthcare, and interpretability. Selected Additional Shinies: BECA recipient; AFOSR YIP and NSF CAREER recipient; Sloan Fellow; IEEE AI Top 10 to Watch.