We propose Automatic Feature Explanation using Contrasting Concepts (FALCON), an interpretability framework to explain features of image representations. For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset (such as LAION-400m) and a pre-trained vision-language model such as CLIP. Each word in the captions is scored and ranked, yielding a small number of shared, human-understandable concepts that closely describe the target feature. FALCON also applies contrastive interpretation using lowly activating (counterfactual) images to eliminate spurious concepts. Although many existing approaches interpret features independently, we observe that in state-of-the-art self-supervised and supervised models, less than 20% of the representation space can be explained by individual features. We show that features in larger spaces become more interpretable when studied in groups and can be explained with high-order scoring concepts through FALCON. We discuss how extracted concepts can be used to explain and debug failures in downstream tasks. Finally, we present a technique to transfer concepts from one (explainable) representation space to another, unseen representation space by learning a simple linear transformation.
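The concept-transfer step mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract only states that a simple linear transformation is learned; fitting it by ordinary least squares, mapping from the unseen space into the explainable one, and every name below (`fit_linear_map`, `source`, `target`, `true_W`) are assumptions made for illustration, and the features are synthetic rather than taken from any real model.

```python
import numpy as np

def fit_linear_map(source_feats, target_feats):
    """Least-squares fit of W so that source_feats @ W approximates target_feats.

    Assumption: ordinary least squares stands in for whatever procedure
    the authors use to learn their linear transformation.
    """
    W, *_ = np.linalg.lstsq(source_feats, target_feats, rcond=None)
    return W

# Synthetic stand-ins for paired features of the same images in two spaces.
rng = np.random.default_rng(0)
n, d_src, d_tgt = 200, 8, 5
true_W = rng.normal(size=(d_src, d_tgt))   # hypothetical ground-truth map
source = rng.normal(size=(n, d_src))       # hypothetical "unseen" representation
target = source @ true_W                   # hypothetical "explainable" representation

W = fit_linear_map(source, target)
mapped = source @ W                        # unseen-space features expressed in the explainable space
```

Once features are expressed in the explainable space, concepts already extracted for that space could in principle be reused for the mapped features; how well this holds on real models depends on how close to linear the relationship between the two spaces is.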
Author Information
Neha Mukund Kalibhat (University of Maryland)
Shweta Bhardwaj (University of Maryland College Park)
C. Bayan Bruss (Capital One)
Hamed Firooz (Facebook)
Maziar Sanjabi (Meta AI)
Soheil Feizi (University of Maryland)
More from the Same Authors
-
2022 : Towards Better Understanding of Self-Supervised Representations »
Neha Mukund Kalibhat · Kanika Narang · Hamed Firooz · Maziar Sanjabi · Soheil Feizi -
2022 : BARACK: Partially Supervised Group Robustness With Guarantees »
Nimit Sohoni · Maziar Sanjabi · Nicolas Ballas · Aditya Grover · Shaoliang Nie · Hamed Firooz · Christopher Re -
2022 : Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation »
Wenxiao Wang · Alexander Levine · Soheil Feizi -
2022 : Certifiably Robust Multi-Agent Reinforcement Learning against Adversarial Communication »
Yanchao Sun · Ruijie Zheng · Parisa Hassanzadeh · Yongyuan Liang · Soheil Feizi · Sumitra Ganesh · Furong Huang -
2023 : Understanding the Detrimental Class-level Effects of Data Augmentation »
Polina Kirichenko · Mark Ibrahim · Randall Balestriero · Diane Bouchacourt · Ramakrishna Vedantam · Hamed Firooz · Andrew Wilson -
2023 Poster: Run-off Election: Improved Provable Defense against Data Poisoning Attacks »
Keivan Rezaei · Kiarash Banihashem · Atoosa Malemir Chegini · Soheil Feizi -
2023 Poster: Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano »
Chuan Guo · Alexandre Sablayrolles · Maziar Sanjabi -
2023 Poster: GOAT: A Global Transformer on Large-scale Graphs »
Kezhi Kong · Jiuhai Chen · John Kirchenbauer · Renkun Ni · C. Bayan Bruss · Tom Goldstein -
2023 Poster: Text-To-Concept (and Back) via Cross-Model Alignment »
Mazda Moayeri · Keivan Rezaei · Maziar Sanjabi · Soheil Feizi -
2022 : Panel discussion »
Steffen Schneider · Aleksander Madry · Alexei Efros · Chelsea Finn · Soheil Feizi -
2022 : Toward Efficient Robust Training against Union of Lp Threat Models »
Gaurang Sriramanan · Maharshi Gor · Soheil Feizi -
2022 Poster: Federated Learning with Partial Model Personalization »
Krishna Pillutla · Kshitiz Malik · Abdel-rahman Mohamed · Michael Rabbat · Maziar Sanjabi · Lin Xiao -
2022 Spotlight: Federated Learning with Partial Model Personalization »
Krishna Pillutla · Kshitiz Malik · Abdel-rahman Mohamed · Michael Rabbat · Maziar Sanjabi · Lin Xiao -
2022 Poster: Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation »
Wenxiao Wang · Alexander Levine · Soheil Feizi -
2022 Poster: FOCUS: Familiar Objects in Common and Uncommon Settings »
Priyatham Kattakinda · Soheil Feizi -
2022 Spotlight: Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation »
Wenxiao Wang · Alexander Levine · Soheil Feizi -
2022 Spotlight: FOCUS: Familiar Objects in Common and Uncommon Settings »
Priyatham Kattakinda · Soheil Feizi -
2022 Poster: UNIREX: A Unified Learning Framework for Language Model Rationale Extraction »
Aaron Chan · Maziar Sanjabi · Lambert Mathias · Liang Tan · Shaoliang Nie · Xiaochang Peng · Xiang Ren · Hamed Firooz -
2022 Spotlight: UNIREX: A Unified Learning Framework for Language Model Rationale Extraction »
Aaron Chan · Maziar Sanjabi · Lambert Mathias · Liang Tan · Shaoliang Nie · Xiaochang Peng · Xiang Ren · Hamed Firooz -
2021 : Invited Talk 6: Towards Understanding Foundations of Robust Learning »
Soheil Feizi -
2021 Poster: Improved, Deterministic Smoothing for L_1 Certified Robustness »
Alexander Levine · Soheil Feizi -
2021 Poster: Skew Orthogonal Convolutions »
Sahil Singla · Soheil Feizi -
2021 Spotlight: Skew Orthogonal Convolutions »
Sahil Singla · Soheil Feizi -
2021 Oral: Improved, Deterministic Smoothing for L_1 Certified Robustness »
Alexander Levine · Soheil Feizi -
2020 Poster: Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness »
Aounon Kumar · Alexander Levine · Tom Goldstein · Soheil Feizi -
2020 Poster: Second-Order Provable Defenses against Adversarial Attacks »
Sahil Singla · Soheil Feizi -
2020 Poster: On Second-Order Group Influence Functions for Black-Box Predictions »
Samyadeep Basu · Xuchen You · Soheil Feizi -
2019 Poster: Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation »
Sahil Singla · Eric Wallace · Shi Feng · Soheil Feizi -
2019 Oral: Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation »
Sahil Singla · Eric Wallace · Shi Feng · Soheil Feizi -
2019 Poster: Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs »
Yogesh Balaji · Hamed Hassani · Rama Chellappa · Soheil Feizi -
2019 Oral: Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs »
Yogesh Balaji · Hamed Hassani · Rama Chellappa · Soheil Feizi