This workshop will bring together researchers who study the interpretability of predictive models, develop interpretable machine learning algorithms, and develop methodology to interpret black-box machine learning models (e.g., post-hoc interpretations). This is a very exciting time to study interpretable machine learning, as the advances in large-scale optimization and Bayesian inference that have enabled the rise of black-box machine learning are now also starting to be exploited to develop principled approaches to large-scale interpretable machine learning. Participants in the workshop will exchange ideas on these and allied topics, including:
● Quantifying and axiomatizing interpretability
● Psychology of human concept learning
● Rule learning, symbolic regression, and case-based reasoning
● Generalized additive models, sparsity and interpretability
● Visual analytics
● Interpretable unsupervised models (clustering, topic models, etc.)
● Interpretation of black-box models (including deep neural networks)
● Causality of predictive models
● Verifying, diagnosing and debugging machine learning systems
● Interpretability in reinforcement learning.
Doctors, judges, business executives, and many other people are faced with making critical decisions that can have profound consequences. For example, doctors decide which treatment to administer to patients, judges decide on prison sentences for convicts, and business executives decide to enter new markets and acquire other companies. Such decisions are increasingly being supported by predictive models learned by algorithms from historical data.
The latest trend in machine learning is to use highly nonlinear complex systems such as deep neural networks, kernel methods, and large ensembles of diverse classifiers. While such approaches often produce impressive, state-of-the-art prediction accuracies, their black-box nature offers little comfort to decision makers. Therefore, in order for predictions to be adopted, trusted, and safely used by decision makers in mission-critical applications, it is imperative to develop machine learning methods that produce interpretable models with excellent predictive accuracy. It is in this way that machine learning methods can have an impact on consequential real-world applications.
Wed 3:30 p.m. - 3:45 p.m. | Contributed Talk
A. Dhurandhar, V. Iyengar, R. Luss, and K. Shanmugam, "A Formal Framework to Characterize Interpretability of Procedures"
We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding. Our key insight is that interpretability is not an absolute concept, so we define it relative to a target model, which may or may not be a human. We define a framework that allows for comparing interpretable procedures by linking interpretability to important practical aspects such as accuracy and robustness. We characterize many of the current state-of-the-art interpretable methods in our framework, illustrating its general applicability.
Karthikeyan Shanmugam
Wed 3:45 p.m. - 4:00 p.m. | Contributed Talk
A. Henelius, K. Puolamäki, and A. Ukkonen, "Interpreting Classifiers through Attribute Interactions in Datasets"
In this work we present the novel ASTRID method for investigating which attribute interactions classifiers exploit when making predictions. Attribute interactions in classification tasks mean that two or more attributes together provide stronger evidence for a particular class label. Knowledge of such interactions makes models more interpretable by revealing associations between attributes. This has applications, e.g., in pharmacovigilance to identify interactions between drugs or in bioinformatics to investigate associations between single nucleotide polymorphisms. We also show how the discovered attribute partitioning is related to a factorisation of the data-generating distribution and empirically demonstrate the utility of the proposed method.
Andreas Henelius
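To make the idea of attribute interactions concrete, here is a toy probe written in the spirit of this talk (it is not the ASTRID algorithm itself, and the dataset, model, and attribute pair are illustrative assumptions): it asks whether a trained classifier exploits the pairing between two attributes by comparing test accuracy when the pair is shuffled jointly (pairing preserved) versus independently (pairing broken).

    # Toy probe in the spirit of attribute-interaction analysis (not ASTRID itself):
    # compare accuracy when a pair of columns is permuted jointly (their pairing is
    # preserved) vs. independently (it is broken).
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    def permuted_accuracy(i, j, joint):
        """Accuracy after shuffling columns i and j of the test set."""
        Xp = X_te.copy()
        p = rng.permutation(len(Xp))
        Xp[:, i] = Xp[p, i]
        # A joint shuffle reuses the same permutation, keeping the (i, j) pairing intact.
        Xp[:, j] = Xp[p if joint else rng.permutation(len(Xp)), j]
        return clf.score(Xp, y_te)

    i, j = 0, 7  # candidate attribute pair (illustrative choice)
    gap = permuted_accuracy(i, j, joint=True) - permuted_accuracy(i, j, joint=False)
    print(f"accuracy drop from breaking the ({i}, {j}) interaction: {gap:.3f}")

A large gap suggests the classifier relies on the joint behaviour of the two attributes; ASTRID's attribute partitioning addresses the analogous question for a full grouping of the attributes rather than a single pair.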
Wed 4:00 p.m. - 4:15 p.m. | Contributed Talk
S. Lundberg and S.-I. Lee, "Consistent Feature Attribution for Tree Ensembles"
It is critical in many applications to understand what features are important for a model, and why individual predictions were made. For tree ensemble methods these questions are usually answered by attributing importance values to input features, either globally or for a single prediction. Here we show that current feature attribution methods are inconsistent, which means changing the model to rely more on a given feature can actually decrease the importance assigned to that feature. To address this problem we develop fast exact solutions for SHAP (SHapley Additive exPlanation) values, which were recently shown to be the unique additive feature attribution method based on conditional expectations that is both consistent and locally accurate. We integrate these improvements into the latest version of XGBoost, demonstrate the inconsistencies of current methods, and show how using SHAP values results in significantly improved supervised clustering performance. Feature importance values are a key part of understanding widely used models such as gradient boosting trees and random forests. We believe our work improves on the state-of-the-art in important ways, and may impact any current user of tree ensemble methods.
Naozumi Hiranuma
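For readers who want to experiment, a minimal sketch using the open-source shap and xgboost packages is below; it is not the authors' exact integration, and the synthetic data and model settings are illustrative assumptions.

    # Minimal sketch: SHAP attributions for a tree ensemble vs. the booster's
    # built-in importances (assumes the shap and xgboost packages are installed).
    import numpy as np
    import shap
    import xgboost
    from sklearn.datasets import make_regression

    X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)
    model = xgboost.XGBRegressor(n_estimators=100, max_depth=3).fit(X, y)

    # Per-sample, per-feature SHAP values; averaging their magnitudes gives a
    # global importance ranking that is consistent in the sense of the talk.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)  # shape (n_samples, n_features)

    print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0).round(3))
    print("built-in importances   :", model.feature_importances_.round(3))

Comparing the two rankings on a model deliberately altered to rely more on one feature is the kind of check the talk uses to expose inconsistency in the built-in importances.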
Wed 4:15 p.m. - 5:00 p.m. | Invited Talk: D. Sontag
Wed 5:30 p.m. - 5:45 p.m. | Contributed Talk
S. Penkov and S. Ramamoorthy, "Program Induction to Interpret Transition Systems"
Explaining and reasoning about processes which underlie observed black-box phenomena enables the discovery of causal mechanisms, derivation of suitable abstract representations and the formulation of more robust predictions. We propose to learn high level functional programs in order to represent abstract models which capture the invariant structure in the observed data. We introduce the π-machine (program-induction machine) -- an architecture able to induce interpretable LISP-like programs from observed data traces. We propose an optimisation procedure for program learning based on backpropagation, gradient descent and A* search. We apply the proposed method to two problems: system identification of dynamical systems and explaining the behaviour of a DQN agent. Our results show that the π-machine can efficiently induce interpretable programs from individual data traces.
Svetlin Penkov
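The following self-contained toy (not the π-machine, and far simpler than the architecture in the talk) illustrates the general flavour of inducing a small functional program from observed input-output traces with an A*-style best-first search over compositions of primitives; the primitives, traces, and scoring are all illustrative assumptions.

    # Toy program induction from data traces via best-first (A*-style) search.
    # This only sketches the search component; it is not the pi-machine.
    import heapq

    PRIMITIVES = {"inc": lambda x: x + 1, "double": lambda x: 2 * x, "square": lambda x: x * x}
    traces = [(1, 5), (2, 17), (3, 37)]  # observations of an unknown f, here f(x) = (2x)^2 + 1

    def run(program, x):
        for op in program:
            x = PRIMITIVES[op](x)
        return x

    def cost(program):
        # Path cost = program length; simple (non-admissible) fit heuristic on top.
        mismatches = sum(run(program, x) != y for x, y in traces)
        return len(program) + 10 * mismatches

    def induce(max_len=4):
        frontier = [(cost(()), ())]
        while frontier:
            _, prog = heapq.heappop(frontier)
            if all(run(prog, x) == y for x, y in traces):
                return prog
            if len(prog) < max_len:
                for op in PRIMITIVES:
                    child = prog + (op,)
                    heapq.heappush(frontier, (cost(child), child))
        return None

    print(induce())  # e.g. ('double', 'square', 'inc')

The returned program is directly readable as an explanation of the traces, which is the sense of interpretability the talk pursues; the actual system couples such search with gradient-based optimisation.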
Wed 5:45 p.m. - 6:00 p.m. | Contributed Talk
R. L. Phillips, K. H. Chang, and S. Friedler, "Interpretable Active Learning"
Active learning has long been a topic of study in machine learning. However, as increasingly complex and opaque models have become standard practice, the process of active learning, too, has become more opaque. There has been little investigation into interpreting what specific trends and patterns an active learning strategy may be exploring. This work expands on the Local Interpretable Model-agnostic Explanations framework (LIME) to provide explanations for active learning recommendations. We demonstrate how LIME can be used to generate locally faithful explanations for an active learning strategy, and how these explanations can be used to understand how different models and datasets explore a problem space over time. In order to quantify the per-subgroup differences in how an active learning strategy queries spatial regions, we introduce a notion of uncertainty bias (based on disparate impact) to measure the discrepancy in the confidence for a model's predictions between one subgroup and another. Using the uncertainty bias measure, we show that our query explanations accurately reflect the subgroup focus of the active learning queries, allowing for an interpretable explanation of what is being learned as points with similar sources of uncertainty have their uncertainty bias resolved. We demonstrate that this technique can be applied to track uncertainty bias over user-defined clusters or automatically generated clusters based on the source of uncertainty.
Richard Phillips
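A minimal sketch of the basic recipe, using the open-source lime package together with scikit-learn uncertainty sampling, is shown below; it omits the paper's uncertainty-bias analysis, and the dataset, initial pool, and parameters are illustrative assumptions.

    # Sketch: explain an uncertainty-sampling query with LIME (not the paper's full method).
    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    data = load_iris()
    X, y = data.data, data.target
    labeled = np.arange(0, 150, 10)                        # small class-balanced initial pool
    pool = np.setdiff1d(np.arange(len(X)), labeled)

    clf = RandomForestClassifier(random_state=0).fit(X[labeled], y[labeled])

    # Uncertainty sampling: query the pool point with the smallest prediction margin.
    proba = clf.predict_proba(X[pool])
    margin = np.sort(proba, axis=1)[:, -1] - np.sort(proba, axis=1)[:, -2]
    query = pool[np.argmin(margin)]

    explainer = LimeTabularExplainer(X[labeled], feature_names=data.feature_names,
                                     class_names=data.target_names, discretize_continuous=True)
    exp = explainer.explain_instance(X[query], clf.predict_proba, num_features=4, top_labels=1)
    print("why this point was queried:", exp.as_list(label=exp.available_labels()[0]))

Aggregating such explanations over many queries, grouped by subgroup or by cluster, is roughly how the paper tracks what an active learning strategy is exploring over time.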
Wed 6:00 p.m. - 6:15 p.m. | Contributed Talk
C. Rosenbaum, T. Gao, and T. Klinger, "e-QRAQ: A Multi-turn Reasoning Dataset and Simulator with Explanations"
In this paper we present a new dataset and user simulator e-QRAQ (explainable Query, Reason, and Answer Question) which tests an Agent's ability to read an ambiguous text; ask questions until it can answer a challenge question; and explain the reasoning behind its questions and answer. The User simulator provides the Agent with a short, ambiguous story and a challenge question about the story. The story is ambiguous because some of the entities have been replaced by variables. At each turn the Agent may ask for the value of a variable or try to answer the challenge question. In response the User simulator provides a natural language explanation of why the Agent's query or answer was useful in narrowing down the set of possible answers, or not. To demonstrate one potential application of the e-QRAQ dataset, we train a new neural architecture based on End-to-End Memory Networks to successfully generate both predictions and partial explanations of its current understanding of the problem. We observe a strong correlation between the quality of the prediction and explanation.
Wed 6:15 p.m. - 7:00 p.m. | Invited Talk: T. Jebara, "Interpretability Through Causality"
While interpretability often involves finding more parsimonious or sparser models to facilitate human understanding, Netflix also seeks to achieve human interpretability by pursuing causal learning. Predictive models can be impressively accurate in a passive setting but might disappoint a human user who expects the recovered relationships to be causal. More importantly, a predictive model's outcomes may no longer be accurate if the input variables are perturbed through an active intervention. I will briefly discuss applications at Netflix across messaging, marketing and originals promotion which leverage causal modeling in order to achieve models that can be actionable as well as interpretable. In particular, techniques such as two stage least squares (2SLS), instrumental variables (IV), extensions to generalized linear models (GLMs), and other causal methods will be summarized. These causal models can surprisingly recover more interpretable and simpler models than their purely predictive counterparts. Furthermore, sparsity can potentially emerge when causal models ignore spurious relationships that might otherwise be recovered in a purely predictive objective function. In general, causal models achieve better results algorithmically in active intervention settings and enjoy broader adoption from human stakeholders.
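As a concrete illustration of the two-stage least squares idea mentioned in the talk, the following numpy-only sketch on synthetic data (not Netflix code, just the textbook estimator) shows how an instrument recovers a causal coefficient that plain regression gets wrong.

    # Two-stage least squares (2SLS) on synthetic confounded data.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    u = rng.normal(size=n)                        # unobserved confounder
    z = rng.normal(size=n)                        # instrument: affects x, not y directly
    x = 0.8 * z + u + rng.normal(size=n)          # treatment, confounded by u
    y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true causal effect of x is 2.0

    def ols(target, *cols):
        # Least-squares coefficients [intercept, slopes...] for target ~ cols.
        A = np.column_stack([np.ones(n), *cols])
        return np.linalg.lstsq(A, target, rcond=None)[0]

    naive = ols(y, x)[1]                                        # biased by the confounder
    x_hat = ols(x, z) @ np.column_stack([np.ones(n), z]).T      # stage 1: fitted x from z
    causal = ols(y, x_hat)[1]                                   # stage 2: regress y on fitted x

    print(f"OLS estimate  {naive:.2f}  (biased)")
    print(f"2SLS estimate {causal:.2f}  (close to the true 2.0)")

Because the fitted values in stage 2 vary only through the instrument, the spurious contribution of the confounder drops out, which is also why such models tend to be simpler and more actionable under intervention.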
Wed 9:00 p.m. - 9:15 p.m. | Contributed Talk
W. Tansey, J. Thomason, and J. G. Scott, "Interpretable Low-Dimensional Regression via Data-Adaptive Smoothing"
We consider the problem of estimating a regression function in the common situation where the number of features is small, where interpretability of the model is a high priority, and where simple linear or additive models fail to provide adequate performance. To address this problem, we present Maximum Variance Total Variation denoising (MVTV), an approach that is conceptually related both to CART and to the more recent CRISP algorithm, a state-of-the-art alternative method for interpretable nonlinear regression. MVTV divides the feature space into blocks of constant value and fits the value of all blocks jointly via a convex optimization routine. Our method is fully data-adaptive, in that it incorporates highly robust routines for tuning all hyperparameters automatically. We compare our approach against CART and CRISP via both a complexity-accuracy tradeoff metric and a human study, demonstrating that MVTV is a more powerful and interpretable method.
Wesley Tansey
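The following rough sketch (not the authors' MVTV implementation) shows the underlying ingredient: a piecewise-constant fit over a binned 2D feature space with a total-variation penalty, solved as a convex problem with cvxpy and scipy. The penalty strength is fixed by hand here rather than tuned data-adaptively, and the synthetic data and grid size are illustrative assumptions.

    # Piecewise-constant regression on a 2D grid via total-variation denoising.
    import cvxpy as cp
    import numpy as np
    from scipy.stats import binned_statistic_2d

    rng = np.random.default_rng(0)
    x1, x2 = rng.uniform(size=2000), rng.uniform(size=2000)
    y = np.sin(3 * x1) * (x2 > 0.5) + 0.2 * rng.normal(size=2000)

    # Bin the low-dimensional feature space and average y within each cell;
    # empty cells are crudely set to zero for this sketch.
    grid = binned_statistic_2d(x1, x2, y, statistic="mean", bins=20)
    cell_means = np.nan_to_num(grid.statistic)

    beta = cp.Variable((20, 20))
    lam = 0.5  # penalty strength, chosen by hand (MVTV tunes this automatically)
    objective = cp.Minimize(cp.sum_squares(beta - cell_means) + lam * cp.tv(beta))
    cp.Problem(objective).solve()

    print("fitted blocks (rounded):", np.round(beta.value, 2)[:3, :3])

The TV penalty merges neighbouring cells into constant blocks, which is what makes the fitted surface easy to read off as a small set of regions and values.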
Wed 9:15 p.m. - 10:00 p.m. | Invited Talk: P. W. Koh
Wed 10:30 p.m. - 10:45 p.m. | Contributed Talk
I. Valera, M. F. Pradier, and Z. Ghahramani, "General Latent Feature Modeling for Data Exploration Tasks"
This paper introduces a general Bayesian nonparametric latent feature model suitable for performing automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be discrete, continuous or mixed variables. The proposed model presents several important properties. First, it accounts for heterogeneous data while it can be inferred in linear time with respect to the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, easing the interpretability of the obtained latent features in data exploration tasks.
Wed 10:45 p.m. - 11:00 p.m. | Contributed Talk
A. Weller, "Challenges for Transparency"
Transparency is often deemed critical to enable effective real-world deployment of intelligent systems. Yet the motivations for and benefits of different types of transparency can vary significantly depending on context, and objective measurement criteria are difficult to identify. We provide a brief survey, suggesting challenges and related concerns. We highlight and review settings where transparency may cause harm, discussing connections across privacy, multi-agent game theory, economics, fairness and trust.
Adrian Weller
Wed 11:00 p.m. - 11:05 p.m. | ICML WHI 2017 Awards Ceremony
Join us in recognizing the best papers of the workshop.
Wed 11:05 p.m. - 12:00 a.m. | Panel Discussion: Human Interpretability in Machine Learning
Panelists: Tony Jebara, Bernhard Schölkopf, Been Kim, Kush Varshney. Moderator: Adrian Weller.
Author Information
Kush Varshney (IBM Research AI)
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, and is a Turing AI Fellow leading work on trustworthy Machine Learning (ML). He is a Principal Research Fellow in ML at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.
Been Kim (Google Brain)
Dmitry Malioutov (The D. E. Shaw Group)
Dmitry Malioutov is a research staff member at the IBM T. J. Watson Research Center.
More from the Same Authors
- 2021: Diverse and Amortised Counterfactual Explanations for Uncertainty Estimates (Dan Ley · Umang Bhatt · Adrian Weller)
- 2021: On the Fairness of Causal Algorithmic Recourse (Julius von Kügelgen · Amir-Hossein Karimi · Umang Bhatt · Isabel Valera · Adrian Weller · Bernhard Schölkopf)
- 2021: Towards Principled Disentanglement for Domain Generalization (Hanlin Zhang · Yi-Fan Zhang · Weiyang Liu · Adrian Weller · Bernhard Schölkopf · Eric Xing)
- 2021: CrossWalk: Fairness-enhanced Node Representation Learning (Ahmad Khajehnejad · Moein Khajehnejad · Krishna Gummadi · Adrian Weller · Baharan Mirzasoleiman)
- 2022: Perspectives on Incorporating Expert Feedback into Model Updates (Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar)
- 2023: Don't trust your eyes: on the (un)reliability of feature visualizations (Robert Geirhos · Roland S. Zimmermann · Blair Bilodeau · Wieland Brendel · Been Kim)
- 2023: Algorithms for Optimal Adaptation of Diffusion Models to Reward Functions (Krishnamurthy Dvijotham · Shayegan Omidshafiei · Kimin Lee · Katie Collins · Deepak Ramachandran · Adrian Weller · Mohammad Ghavamzadeh · Milad Nasresfahani · Ying Fan · Jeremiah Liu)
- 2023: TabCBM: Concept-based Interpretable Neural Networks for Tabular Data (Mateo Espinosa Zarlenga · Zohreh Shams · Michael Nelson · Been Kim · Mateja Jamnik)
- 2023: The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling probabilistic social inferences from linguistic inputs (Lance Ying · Katie Collins · Megan Wei · Cedegao Zhang · Tan Zhi-Xuan · Adrian Weller · Josh Tenenbaum · Catherine Wong)
- 2023 Oral: Simplex Random Features (Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller)
- 2023 Poster: Efficient Graph Field Integrators Meet Point Clouds (Krzysztof Choromanski · Arijit Sehanobish · Han Lin · YUNFAN ZHAO · Eli Berger · Tetiana Parshakova · Qingkai Pan · David Watkins · Tianyi Zhang · Valerii Likhosherstov · Somnath Basu Roy Chowdhury · Kumar Avinava Dubey · Deepali Jain · Tamas Sarlos · Snigdha Chaturvedi · Adrian Weller)
- 2023 Poster: Simplex Random Features (Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller)
- 2023 Poster: On the Relationship Between Explanation and Prediction: A Causal View (Amir-Hossein Karimi · Krikamol Muandet · Simon Kornblith · Bernhard Schölkopf · Been Kim)
- 2023 Poster: Is Learning Summary Statistics Necessary for Likelihood-free Inference? (Yanzhi Chen · Michael Gutmann · Adrian Weller)
- 2022: Spotlight Presentations (Adrian Weller · Osbert Bastani · Jake Snell · Tal Schuster · Stephen Bates · Zhendong Wang · Margaux Zaffran · Danielle Rasooly · Varun Babbar)
- 2022 Workshop: Workshop on Human-Machine Collaboration and Teaming (Umang Bhatt · Katie Collins · Maria De-Arteaga · Bradley Love · Adrian Weller)
- 2022 Poster: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers (Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten)
- 2022 Poster: Measuring Representational Robustness of Neural Networks Through Shared Invariances (Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller)
- 2022 Oral: Measuring Representational Robustness of Neural Networks Through Shared Invariances (Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller)
- 2022 Spotlight: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers (Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten)
- 2021: Invited Talk: Kush Varshney (Kush Varshney)
- 2021 Poster: Debiasing a First-order Heuristic for Approximate Bi-level Optimization (Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller)
- 2021 Spotlight: Debiasing a First-order Heuristic for Approximate Bi-level Optimization (Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller)
- 2020 Workshop: 5th ICML Workshop on Human Interpretability in Machine Learning (WHI) (Adrian Weller · Alice Xiang · Amit Dhurandhar · Been Kim · Dennis Wei · Kush Varshney · Umang Bhatt)
- 2020 Poster: Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing (Sanghamitra Dutta · Dennis Wei · Hazar Yueksel · Pin-Yu Chen · Sijia Liu · Kush Varshney)
- 2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group (Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani)
- 2020 Poster: Invariant Risk Minimization Games (Kartik Ahuja · Karthikeyan Shanmugam · Kush Varshney · Amit Dhurandhar)
- 2019 Workshop: Human In the Loop Learning (HILL) (Xin Wang · Xin Wang · Fisher Yu · Shanghang Zhang · Joseph Gonzalez · Yangqing Jia · Sarah Bird · Kush Varshney · Been Kim · Adrian Weller)
- 2019 Poster: Unifying Orthogonal Monte Carlo Methods (Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller)
- 2019 Oral: Unifying Orthogonal Monte Carlo Methods (Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller)
- 2019 Poster: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning (Tameem Adel · Adrian Weller)
- 2019 Poster: Topological Data Analysis of Decision Boundaries with Application to Model Selection (Karthikeyan Ramamurthy · Kush Varshney · Krishnan Mody)
- 2019 Oral: Topological Data Analysis of Decision Boundaries with Application to Model Selection (Karthikeyan Ramamurthy · Kush Varshney · Krishnan Mody)
- 2019 Oral: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning (Tameem Adel · Adrian Weller)
- 2018 Poster: Blind Justice: Fairness with Encrypted Sensitive Attributes (Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller)
- 2018 Oral: Blind Justice: Fairness with Encrypted Sensitive Attributes (Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller)
- 2018 Poster: Bucket Renormalization for Approximate Inference (Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin)
- 2018 Oral: Bucket Renormalization for Approximate Inference (Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin)
- 2018 Poster: Structured Evolution with Compact Architectures for Scalable Policy Optimization (Krzysztof Choromanski · Mark Rowland · Vikas Sindhwani · Richard E Turner · Adrian Weller)
- 2018 Poster: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models (Tameem Adel · Zoubin Ghahramani · Adrian Weller)
- 2018 Oral: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models (Tameem Adel · Zoubin Ghahramani · Adrian Weller)
- 2018 Oral: Structured Evolution with Compact Architectures for Scalable Policy Optimization (Krzysztof Choromanski · Mark Rowland · Vikas Sindhwani · Richard E Turner · Adrian Weller)
- 2017 Workshop: Reliable Machine Learning in the Wild (Dylan Hadfield-Menell · Jacob Steinhardt · Adrian Weller · Smitha Milli)
- 2017: A. Weller, "Challenges for Transparency" (Adrian Weller)
- 2017 Poster: Lost Relatives of the Gumbel Trick (Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller)
- 2017 Talk: Lost Relatives of the Gumbel Trick (Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller)
- 2017 Tutorial: Interpretable Machine Learning (Been Kim · Finale Doshi-Velez)