
Concept-based Explanations for Out-of-Distribution Detectors
Jihye Choi · Jayaram Raghuram · Ryan Feng · Jiefeng Chen · Somesh Jha · Atul Prakash

Wed Jul 26 05:00 PM -- 06:30 PM (PDT) @ Exhibit Hall 1 #442

Out-of-distribution (OOD) detection plays a crucial role in ensuring the safe deployment of deep neural network (DNN) classifiers. While a myriad of methods have focused on improving the performance of OOD detectors, a critical gap remains in interpreting their decisions. We help bridge this gap by providing explanations for OOD detectors based on learned high-level concepts. We first propose two new metrics for assessing the effectiveness of a particular set of concepts for explaining OOD detectors: 1) detection completeness, which quantifies the sufficiency of concepts for explaining an OOD detector's decisions, and 2) concept separability, which captures the distributional separation between in-distribution and OOD data in the concept space. Based on these metrics, we propose an unsupervised framework for learning a set of concepts that satisfy the desired properties of high detection completeness and concept separability, and demonstrate its effectiveness in providing concept-based explanations for diverse off-the-shelf OOD detectors. We also show how to identify prominent concepts contributing to the detection results, and provide further reasoning about their decisions.
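To make the concept-separability idea concrete, the sketch below computes a simple per-concept separation score between in-distribution (ID) and OOD samples in a concept space. This is a hypothetical illustration, not the paper's actual metric: it uses a Fisher-style discriminant ratio as a stand-in for "distributional separation", and the function name and toy data are invented for this example.

```python
# Hypothetical sketch of concept separability: how well do concept-score
# distributions of ID vs. OOD samples separate, per concept dimension?
# NOTE: this Fisher-style ratio is an illustrative stand-in, not the
# metric proposed in the paper.
import numpy as np

def fisher_separability(id_scores: np.ndarray, ood_scores: np.ndarray) -> np.ndarray:
    """Per-concept Fisher discriminant ratio.

    id_scores:  (n_id, n_concepts) concept activations for ID samples
    ood_scores: (n_ood, n_concepts) concept activations for OOD samples
    Returns one score per concept; larger values mean the concept
    separates ID from OOD data more cleanly.
    """
    mu_id, mu_ood = id_scores.mean(axis=0), ood_scores.mean(axis=0)
    var_id, var_ood = id_scores.var(axis=0), ood_scores.var(axis=0)
    return (mu_id - mu_ood) ** 2 / (var_id + var_ood + 1e-12)

rng = np.random.default_rng(0)
# Toy data: concept 0 shifts between ID and OOD, concept 1 does not.
id_s = rng.normal([0.0, 0.0], 1.0, size=(500, 2))
ood_s = rng.normal([3.0, 0.0], 1.0, size=(500, 2))
sep = fisher_separability(id_s, ood_s)
print(sep)  # concept 0 scores far higher than concept 1
```

A concept set with high scores of this kind would let a human (or a simple rule) distinguish ID from OOD inputs by looking at concept activations alone, which is the intuition the separability metric formalizes.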

Author Information

Jihye Choi (University of Wisconsin-Madison)
Jayaram Raghuram (University of Wisconsin-Madison)
Ryan Feng (University of Michigan)
Jiefeng Chen (University of Wisconsin-Madison)
Somesh Jha (University of Wisconsin-Madison)
Atul Prakash (University of Michigan, Ann Arbor)
