Poster in Workshop: 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)

(Un)reasonable Allure of Ante-hoc Interpretability for High-stakes Domains: Transparency Is Necessary but Insufficient for Explainability

Kacper Sokol · Julia Vogt

Keywords: [ Reasoning ] [ Modularity ] [ Definition ] [ Interpretable Machine Learning ] [ Lineage ] [ Provenance ] [ Explainable Artificial Intelligence ]


Abstract:

Ante-hoc interpretability has become the holy grail of explainable machine learning for high-stakes domains such as healthcare; however, this notion is elusive, lacks a widely accepted definition and depends on the deployment context. It can refer to predictive models whose structure adheres to domain-specific constraints, or to models that are inherently transparent. The latter notion assumes observers who judge this quality, whereas the former additionally presupposes that these observers possess technical and domain expertise, in certain cases rendering such models unintelligible to other audiences. Moreover, the distinction between ante-hoc interpretability and the less desirable post-hoc explainability, which refers to methods that construct a separate explanatory model, is vague, given that transparent predictive models may still require (post-)processing to generate admissible explanatory insights. Ante-hoc interpretability is thus an overloaded concept that spans a range of implicit properties, which we unpack in this paper to better understand what is needed for its safe deployment across high-stakes domains. To this end, we outline model- and explainer-specific desiderata that allow us to navigate its distinct realisations in view of the envisaged application domain and audience.
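To make the abstract's central observation concrete, the following is a minimal, illustrative sketch (not taken from the paper) showing that even a canonically ante-hoc interpretable model, here a shallow scikit-learn decision tree trained on the breast-cancer dataset, still needs post-processing before it yields an admissible explanation of a single prediction: the decision path must be extracted and translated into a human-readable rule. The dataset, model depth, and rule format are all assumptions chosen purely for illustration.

```python
# Minimal sketch: a transparent model whose explanations still require
# (post-)processing -- traversing the fitted tree and rendering the
# decision path for one instance as a human-readable rule.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The model is structurally transparent (a small tree), yet explaining a
# single prediction means locating the instance's leaf and translating
# each split condition on the way there into plain language.
instance = X[0:1]
tree = model.tree_
path_nodes = model.decision_path(instance).indices  # nodes visited
leaf = model.apply(instance)[0]

conditions = []
for node in path_nodes:
    if node == leaf:  # the leaf carries no split condition
        continue
    feature_idx = tree.feature[node]
    threshold = tree.threshold[node]
    op = "<=" if instance[0, feature_idx] <= threshold else ">"
    conditions.append(f"{data.feature_names[feature_idx]} {op} {threshold:.2f}")

prediction = data.target_names[model.predict(instance)[0]]
print(f"Predicted '{prediction}' because " + " AND ".join(conditions))
```

Whether such a rule is an admissible explanation still depends on the audience: reading it presupposes familiarity with the input features, which echoes the abstract's point that transparency alone does not guarantee intelligibility.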
