Poster in Workshop: 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)
Automated Detection of Interpretable Causal Inference Opportunities: Regression Discontinuity Subgroup Discovery
Tong Liu · Patrick Lawlor · Lyle Ungar · Konrad Kording · Rahul Ladhania
Keywords: [ Regression discontinuity ] [ clinical guidelines ] [ compliance estimation ] [ Causal Inference ]
Treatment decisions based on cutoffs of continuous variables, such as the blood sugar threshold for diabetes diagnosis, provide valuable opportunities for causal inference. Regression discontinuities (RDs) are used to analyze such scenarios, where units just above and below the threshold differ only in their treatment assignment status, thus providing as-if randomization. In practice, however, implementing RD studies can be difficult because identifying treatment thresholds requires considerable domain expertise. Furthermore, the thresholds may differ across population subgroups (e.g., the blood sugar threshold for diabetes may differ across demographics), and ignoring these differences can lower statistical power. Here, we introduce Regression Discontinuity SubGroup Discovery (RDSGD), a machine learning method that identifies more powerful and interpretable subgroups for RD thresholds. Using a claims dataset with over 60 million patients, we apply our method to multiple clinical contexts and identify subgroups with increased compliance to treatment assignment thresholds. As subgroup-specific treatment thresholds are relevant to many diseases, RDSGD can be a powerful tool for discovering new avenues for causal estimation across a range of clinical applications.
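To make the notion of compliance with a treatment assignment threshold concrete, the following is a minimal sketch of a generic local linear RD estimate on synthetic data. It is not the RDSGD method itself. The function name `estimate_jump`, the cutoff, the bandwidth, and the simulated treatment probabilities are all assumptions chosen for illustration.

```python
import numpy as np

def estimate_jump(running, outcome, cutoff, bandwidth):
    """Local linear estimate of the discontinuity in `outcome` at `cutoff`.

    Hypothetical helper: fits a separate straight line on each side of the
    cutoff, using only points within `bandwidth`, and returns the difference
    between the two fitted values at the cutoff (right minus left).
    """
    running = np.asarray(running, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    centered = running - cutoff
    limits = []
    for side in (centered < 0, centered >= 0):
        mask = side & (np.abs(centered) <= bandwidth)
        # np.polyfit returns coefficients from highest degree down, so the
        # intercept is the fitted value at the cutoff (centered == 0).
        slope, intercept = np.polyfit(centered[mask], outcome[mask], deg=1)
        limits.append(intercept)
    return limits[1] - limits[0]

# Synthetic example (assumed values, not from the claims dataset):
# the probability of receiving treatment jumps by 0.5 at the cutoff.
rng = np.random.default_rng(0)
n = 20000
cutoff = 5.0
x = rng.uniform(0.0, 10.0, size=n)
p_treat = 0.2 + 0.02 * (x - cutoff) + 0.5 * (x >= cutoff)
treated = rng.random(n) < p_treat

# The estimated jump in treatment uptake at the cutoff is the compliance.
compliance = estimate_jump(x, treated, cutoff, bandwidth=1.0)
print(f"estimated compliance at cutoff: {compliance:.2f}")
```

Running the same estimate separately within candidate subgroups, each with its own cutoff, would show how compliance can vary across subgroups, which is the kind of heterogeneity the abstract describes.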