Poster in Workshop: 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)

Auditing for Human Expertise
Rohan Alur · Loren Laine · Darrick Li · Manish Raghavan · Devavrat Shah · Dennis Shung
Keywords: [ Machine learning for healthcare ] [ human-AI complementarity ] [ hypothesis testing ] [ interpretability in machine learning ]
High-stakes prediction tasks (e.g., patient diagnosis) are often handled by trained human experts. A common source of concern about automation in these settings is that experts may exercise intuition that is difficult to model and/or have access to information (e.g., conversations with a patient) that is simply unavailable to a would-be algorithm. This raises a natural question: do human experts add value that could not be captured by an algorithmic predictor? In this work, we develop a statistical framework under which we can pose this question as a natural hypothesis test. We highlight the utility of our procedure using admissions data collected from the emergency department of a large academic hospital system, where we show that physicians' admit/discharge decisions for patients with acute gastrointestinal bleeding (AGIB) appear to incorporate information not captured in a standard algorithmic screening tool. This is despite the fact that the screening tool is arguably more accurate than physicians' discretionary decisions, highlighting that, even absent normative concerns about accountability or interpretability, accuracy is insufficient to justify algorithmic automation.
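The abstract does not spell out the test itself. As a rough illustration only, one way to pose "do expert predictions carry information beyond the recorded features?" as a hypothesis test is a conditional permutation test: shuffle the experts' predictions among cases with identical features and check whether the observed predictions agree with outcomes better than the shuffled ones. The sketch below rests on assumptions not taken from the source: features are discrete so cases can be grouped by exact match, the loss is 0/1 disagreement, and the function name `conditional_permutation_pvalue` is hypothetical. It should not be read as the authors' procedure.

```python
import random
from collections import defaultdict


def conditional_permutation_pvalue(features, human_preds, outcomes,
                                   n_perm=1000, seed=0):
    """Illustrative p-value for H0: expert predictions are independent of
    outcomes once the recorded features are fixed (assumed formulation)."""
    rng = random.Random(seed)

    # Group case indices by identical feature tuples (assumes discrete features).
    groups = defaultdict(list)
    for i, x in enumerate(features):
        groups[tuple(x)].append(i)

    def loss(preds):
        # 0/1 loss: number of cases where the prediction disagrees with the outcome.
        return sum(p != y for p, y in zip(preds, outcomes))

    observed = loss(human_preds)
    at_least_as_good = 0
    for _ in range(n_perm):
        shuffled = list(human_preds)
        # Permute predictions only among cases that share the same features.
        for idx in groups.values():
            vals = [human_preds[i] for i in idx]
            rng.shuffle(vals)
            for i, v in zip(idx, vals):
                shuffled[i] = v
        if loss(shuffled) <= observed:
            at_least_as_good += 1

    # Add-one correction keeps the p-value strictly positive.
    return (1 + at_least_as_good) / (1 + n_perm)
```

A small p-value would suggest that the predictions track outcomes in a way the grouped features alone cannot explain; a large one is consistent with the predictions adding nothing beyond those features. With continuous or high-dimensional features, exact matching would need to be replaced by some notion of near-matching, which this sketch does not attempt.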