The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
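To make the formalization concrete, the sketch below illustrates Shepard's universal law of generalization, under which the probability of generalizing from one stimulus to another decays exponentially with their distance in a psychological similarity space. This is not the authors' exact model: the function name, the L1 distance over normalized saliency maps, and the sensitivity parameter are all illustrative assumptions chosen only to show the exponential form of the comparison.

```python
# Minimal sketch (assumptions, not the paper's implementation): score how
# strongly an explainee would generalize from their own saliency map to the
# AI's, using Shepard's exponential generalization gradient.
import numpy as np

def shepard_generalization(saliency_ai, saliency_human, sensitivity=1.0):
    """Exponential generalization between two saliency maps (illustrative)."""
    # Flatten and L1-normalize so both maps are comparable attention distributions.
    a = np.ravel(saliency_ai).astype(float)
    h = np.ravel(saliency_human).astype(float)
    a /= a.sum()
    h /= h.sum()
    # Distance in "similarity space" (illustrative choice: L1 distance).
    distance = np.abs(a - h).sum()
    # Shepard's law: generalization decays exponentially with distance.
    return np.exp(-sensitivity * distance)

# Identical maps score 1.0; less similar maps score lower.
ai_map = np.random.rand(8, 8)
print(shepard_generalization(ai_map, ai_map))
print(shepard_generalization(ai_map, np.random.rand(8, 8)))
```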
Author Information
Scott Cheng-Hsin Yang (Rutgers University)
Nils Erik Tomas Folke (Rutgers University)
I am a postdoctoral associate in the Department of Mathematics and Computer Science at Rutgers University - Newark. I earned my Ph.D. from the University of Cambridge, studying the role of confidence in perceptual and value-based decisions. I then spent one year as a postdoc at the Judge Business School at the University of Cambridge, working on various projects related to how behavioral science can improve public policy. I extended this work at the Department of Policy and Management at Columbia University, where I investigated what factors predict preventive healthcare appointment attendance, before and during COVID-19, and how we can use machine learning to encourage people to attend. I am interested in how psychology and machine learning can be applied to improve real-world decision-making. I am particularly interested in how to build better decision-support systems by improving humans’ mental models of AI and improving AIs’ models of humans. The long-term aim of this work is for a decision-support system to have a distinct model of every user, and for these models to update dynamically as the system learns more about each user.
Patrick Shafto (IAS / Rutgers University)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: A Psychological Theory of Explainability »
  Thu. Jul 21st through Fri. Jul 22nd, Hall E #1015
More from the Same Authors
- 2023 Poster: Coupled Variational Autoencoder »
  Xiaoran Hao · Patrick Shafto
- 2022 Poster: Discrete Probabilistic Inverse Optimal Transport »
  Wei-Ting Chiu · Pei Wang · Patrick Shafto
- 2022 Spotlight: Discrete Probabilistic Inverse Optimal Transport »
  Wei-Ting Chiu · Pei Wang · Patrick Shafto
- 2021 Poster: Interactive Learning from Activity Description »
  Khanh Nguyen · Dipendra Misra · Robert Schapire · Miroslav Dudik · Patrick Shafto
- 2021 Spotlight: Interactive Learning from Activity Description »
  Khanh Nguyen · Dipendra Misra · Robert Schapire · Miroslav Dudik · Patrick Shafto
- 2020 Poster: Sequential Cooperative Bayesian Inference »
  Junqi Wang · Pei Wang · Patrick Shafto