Timezone: »

Workshop on Human Interpretability in Machine Learning (WHI)
Kush Varshney · Adrian Weller · Been Kim · Dmitry Malioutov

Wed Aug 09 03:30 PM -- 12:30 AM (PDT) @ C4.8
Event URL: https://sites.google.com/view/whi2017/home »

This workshop will bring together researchers who study the interpretability of predictive models, develop interpretable machine learning algorithms, and develop methodology to interpret black-box machine learning models (e.g., post-hoc interpretations). This is a very exciting time to study interpretable machine learning, as the advances in large-scale optimization and Bayesian inference that have enabled the rise of black-box machine learning are now also starting to be exploited to develop principled approaches to large-scale interpretable machine learning. Participants in the workshop will exchange ideas on these and allied topics, including:

● Quantifying and axiomatizing interpretability
● Psychology of human concept learning
● Rule learning,Symbolic regression and case-based reasoning
● Generalized additive models, sparsity and interpretability
● Visual analytics
● Interpretable unsupervised models (clustering, topic models, e.t.c)
● Interpretation of black-box models (including deep neural networks)
● Causality of predictive models
● Verifying, diagnosing and debugging machine learning systems
● Interpretability in reinforcement learning.

Doctors, judges, business executives, and many other people are faced with making critical decisions that can have profound consequences. For example, doctors decide which treatment to administer to patients, judges decide on prison sentences for convicts, and business executives decide to enter new markets and acquire other companies. Such decisions are increasingly being supported by predictive models learned by algorithms from historical data.

The latest trend in machine learning is to use highly nonlinear complex systems such as deep neural networks, kernel methods, and large ensembles of diverse classifiers. While such approaches often produce impressive, state-of-the art prediction accuracies, their black-box nature offers little comfort to decision makers. Therefore, in order for predictions to be adopted, trusted, and safely used by decision makers in mission-critical applications, it is imperative to develop machine learning methods that produce interpretable models with excellent predictive accuracy. It is in this way that machine learning methods can have impact on consequential real-world applications.

Wed 3:30 p.m. - 3:45 p.m.

We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding. Our key insight is that interpretability is not an absolute concept and so we define it relative to a target model, which may or may not be a human. We define a framework that allows for comparing interpretable procedures by linking it to important practical aspects such as accuracy and robustness. We characterize many of the current state-of-the-art interpretable methods in our framework portraying its general applicability.

Karthikeyan Shanmugam
Wed 3:45 p.m. - 4:00 p.m.

In this work we present the novel ASTRID method for investigating which attribute interactions classifiers exploit when making predictions. Attribute interactions in classification tasks mean that two or more attributes together provide stronger evidence for a particular class label. Knowledge of such interactions makes models more interpretable by revealing associations between attributes. This has applications, e.g., in pharmacovigilance to identify interactions between drugs or in bioinformatics to investigate associations between single nucleotide polymorphisms. We also show how the found attribute partitioning is related to a factorisation of the data generating distribution and empirically demonstrate the utility of the proposed method.

Andreas Henelius
Wed 4:00 p.m. - 4:15 p.m.

It is critical in many applications to understand what features are important for a model, and why individual predictions were made. For tree ensemble methods these questions are usually answered by attributing importance values to input features, either globally or for a single prediction. Here we show that current feature attribution methods are inconsistent, which means changing the model to rely more on a given feature can actually decrease the importance assigned to that feature. To address this problem we develop fast exact solutions for SHAP (SHapley Additive exPlanation) values, which were recently shown to be the unique additive feature attribution method based on conditional expectations that is both consistent and locally accurate. We integrate these improvements into the latest version of XGBoost, demonstrate the inconsistencies of current methods, and show how using SHAP values results in significantly improved supervised clustering performance. Feature importance values are a key part of understanding widely used models such as gradient boosting trees and random forests. We believe our work improves on the state-of-the-art in important ways, and may impact any current user of tree ensemble methods.

Nao Hiranuma
Wed 4:15 p.m. - 5:00 p.m.
Invited Talk: D. Sontag (Invited Talk)
Wed 5:30 p.m. - 5:45 p.m.

Explaining and reasoning about processes which underlie observed black-box phenomena enables the discovery of causal mechanisms, derivation of suitable abstract representations and the formulation of more robust predictions. We propose to learn high level functional programs in order to represent abstract models which capture the invariant structure in the observed data. We introduce the π-machine (program-induction machine) -- an architecture able to induce interpretable LISP-like programs from observed data traces. We propose an optimisation procedure for program learning based on backpropagation, gradient descent and A* search. We apply the proposed method to two problems: system identification of dynamical systems and explaining the behaviour of a DQN agent. Our results show that the π-machine can efficiently induce interpretable programs from individual data traces.

Svet Penkov
Wed 5:45 p.m. - 6:00 p.m.

Active learning has long been a topic of study in machine learning. However, as increasingly complex and opaque models have become standard practice, the process of active learning, too, has become more opaque. There has been little investigation into interpreting what specific trends and patterns an active learning strategy may be exploring. This work expands on the Local Interpretable Model-agnostic Explanations framework (LIME) to provide explanations for active learning recommendations. We demonstrate how LIME can be used to generate locally faithful explanations for an active learning strategy, and how these explanations can be used to understand how different models and datasets explore a problem space over time. In order to quantify the per-subgroup differences in how an active learning strategy queries spatial regions, we introduce a notion of uncertainty bias (based on disparate impact) to measure the discrepancy in the confidence for a model's predictions between one subgroup and another. Using the uncertainty bias measure, we show that our query explanations accurately reflect the subgroup focus of the active learning queries, allowing for an interpretable explanation of what is being learned as points with similar sources of uncertainty have their uncertainty bias resolved. We demonstrate that this technique can be applied to track uncertainty bias over user-defined clusters or automatically generated clusters based on the source of uncertainty.

Richard L. Phillips
Wed 6:00 p.m. - 6:15 p.m.

In this paper we present a new dataset and user simulator e-QRAQ (explainable Query, Reason, and Answer Question) which tests an Agent's ability to read an ambiguous text; ask questions until it can answer a challenge question; and explain the reasoning behind its questions and answer. The User simulator provides the Agent with a short, ambiguous story and a challenge question about the story. The story is ambiguous because some of the entities have been replaced by variables. At each turn the Agent may ask for the value of a variable or try to answer the challenge question. In response the User simulator provides a natural language explanation of why the Agent's query or answer was useful in narrowing down the set of possible answers, or not. To demonstrate one potential application of the e-QRAQ dataset, we train a new neural architecture based on End-to-End Memory Networks to successfully generate both predictions and partial explanations of its current understanding of the problem. We observe a strong correlation between the quality of the prediction and explanation.

Wed 6:15 p.m. - 7:00 p.m.

While interpretability often involves finding more parsimonious or sparser models to facilitate human understanding, Netflix also seeks to achieve human interpretability by pursuing causal learning. Predictive models can be impressively accurate in a passive setting but might disappoint a human user who expects the recovered relationships to be causal. More importantly, a predictive model's outcomes may no longer be accurate if the input variables are perturbed through an active intervention. I will briefly discuss applications at Netflix across messaging, marketing and originals promotion which leverage causal modeling in order to achieve models that can be actionable as well as interpretable. In particular, techniques such as two stage least squares (2SLS), instrumental variables (IV), extensions to generalized linear models (GLMs), and other causal methods will be summarized. These causal models can surprisingly recover more interpretable and simpler models than their purely predictive counterparts. Furthermore, sparsity can potentially emerge when causal models ignore spurious relationships that might otherwise be recovered in a purely predictive objective function. In general, causal models achieve better results algorithmically in active intervention settings and enjoy broader adoption from human stakeholders.

Wed 9:00 p.m. - 9:15 p.m.

We consider the problem of estimating a regression function in the common situation where the number of features is small, where interpretability of the model is a high priority, and where simple linear or additive models fail to provide adequate performance. To address this problem, we present Maximum Variance Total Variation denoising (MVTV), an approach that is conceptually related both to CART and to the more recent CRISP algorithm, a state-of-the-art alternative method for interpretable nonlinear regression. MVTV divides the feature space into blocks of constant value and fits the value of all blocks jointly via a convex optimization routine. Our method is fully data-adaptive, in that it incorporates highly robust routines for tuning all hyperparameters automatically. We compare our approach against CART and CRISP via both a complexity-accuracy tradeoff metric and a human study, demonstrating that that MVTV is a more powerful and interpretable method.

Wesley Tansey
Wed 9:15 p.m. - 10:00 p.m.
Invited Talk: P. W. Koh (Invited Talk)
Wed 10:30 p.m. - 10:45 p.m.

This paper introduces a general Bayesian non- parametric latent feature model suitable to per- form automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. First, it accounts for heterogeneous data while can be inferred in linear time with respect to the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, easing the interpretability of the obtained latent features in data exploration tasks.

Wed 10:45 p.m. - 11:00 p.m.

Transparency is often deemed critical to enable effective real-world deployment of intelligent systems. Yet the motivations for and benefits of different types of transparency can vary significantly depending on context, and objective measurement criteria are difficult to identify. We provide a brief survey, suggesting challenges and related concerns. We highlight and review settings where transparency may cause harm, discussing connections across privacy, multi-agent game theory, economics, fairness and trust.

Adrian Weller
Wed 11:00 p.m. - 11:05 p.m.

Join us in recognizing the best papers of the workshop.

Wed 11:05 p.m. - 12:00 a.m.

panelists: Tony Jebara, Bernhard Schölkopf, Been Kim, Kush Varshney moderator: Adrian Weller

Author Information

Kush Varshney (IBM Research AI)
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is a Senior Research Fellow in the Machine Learning Group at the University of Cambridge, a Faculty Fellow at the Alan Turing Institute for data science and an Executive Fellow at the Leverhulme Centre for the Future of Intelligence (CFI). He is very interested in all aspects of artificial intelligence, its commercial applications and how it may be used to benefit society. At the CFI, he leads their project on Trust and Transparency. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.

Been Kim (Google Brain)
Dmitry Malioutov (The D. E. Shaw Group)

Dmitry Malioutov is a research staff member at IBM TJ Watson research center.

More from the Same Authors