Poster
Fairwashing explanations with off-manifold detergent
Christopher Anders · Plamen Pasliev · Ann-Kathrin Dombrowski · Klaus-Robert Müller · Pan Kessel
Wed Jul 15 11:00 AM -- 11:45 AM & Thu Jul 16 12:00 AM -- 12:45 AM (PDT)
Explanation methods promise to make black-box classifiers more transparent.
As a result, it is hoped that they can serve as evidence that an algorithm's decision-making process is sensible, fair, and trustworthy, and thereby increase its acceptance by end users.
In this paper, we show both theoretically and experimentally that these hopes are presently unfounded.
Specifically, we show that, for any classifier $g$, one can always construct another classifier $\tilde{g}$ which has the same behavior on the data (same train, validation, and test error) but has arbitrarily manipulated explanation maps.
We derive this statement theoretically using differential geometry and demonstrate it experimentally for various explanation methods, architectures, and datasets.
Motivated by our theoretical insights, we then propose a modification of existing explanation methods which makes them significantly more robust.
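The core idea — that a classifier can be changed off the data manifold without changing its behavior on the data, while its gradient explanations are manipulated arbitrarily — admits a minimal linear illustration. The sketch below is my own toy example under simplifying assumptions (a linear "classifier" on data confined to a linear subspace), not the paper's actual differential-geometric construction; the names `basis`, `a`, and `c` are illustration-only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a low-dimensional data manifold: all data
# points lie in a 2-D linear subspace of a 5-D input space.
basis = rng.normal(size=(5, 2))
X = rng.normal(size=(100, 2)) @ basis.T  # 100 on-manifold points

# "Classifier" g with gradient-based explanation map grad(g) = w.
w = rng.normal(size=5)

def g(x):
    return x @ w

# Off-manifold direction: a unit vector orthogonal to the data
# subspace, taken from a complete QR decomposition of the basis.
q, _ = np.linalg.qr(basis, mode="complete")
a = q[:, 2]

# Manipulated classifier g_tilde = g + c * <a, x>. The extra term
# vanishes on every data point (a is orthogonal to the subspace),
# so train/validation/test behavior is unchanged ...
c = 10.0

def g_tilde(x):
    return g(x) + c * (x @ a)

assert np.allclose(g(X), g_tilde(X))

# ... while the gradient explanation is shifted by c * a everywhere.
grad_g, grad_g_tilde = w, w + c * a
print(np.linalg.norm(grad_g - grad_g_tilde))  # = c, since ||a|| = 1
```

For nonlinear networks and curved data manifolds the paper's construction is more involved, but the mechanism is the same: components of the gradient orthogonal to the data manifold can be set freely without affecting any on-data prediction — which is also why the proposed defense projects explanations back toward the manifold.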
Author Information
Christopher Anders (TU Berlin)
Plamen Pasliev (TU Berlin)
Ann-Kathrin Dombrowski (TU Berlin)
Klaus-Robert Müller (Technische Universität Berlin)
Pan Kessel (TU Berlin)
More from the Same Authors
- 2021: Diffeomorphic Explanations with Normalizing Flows
  Ann-Kathrin Dombrowski
- 2023 Poster: Relevant Walk Search for Explaining Graph Neural Networks
  Ping Xiong · Thomas Schnake · Michael Gastegger · Grégoire Montavon · Klaus-Robert Müller · Shinichi Nakajima
- 2022 Poster: Path-Gradient Estimators for Continuous Normalizing Flows
  Lorenz Vaitl · Kim A. Nicoli · Shinichi Nakajima · Pan Kessel
- 2022 Poster: Efficient Computation of Higher-Order Subgraph Attribution via Message Passing
  Ping Xiong · Thomas Schnake · Grégoire Montavon · Klaus-Robert Müller · Shinichi Nakajima
- 2022 Poster: XAI for Transformers: Better Explanations through Conservative Propagation
  Ameen Ali · Thomas Schnake · Oliver Eberle · Grégoire Montavon · Klaus-Robert Müller · Lior Wolf
- 2022 Spotlight: Efficient Computation of Higher-Order Subgraph Attribution via Message Passing
  Ping Xiong · Thomas Schnake · Grégoire Montavon · Klaus-Robert Müller · Shinichi Nakajima
- 2022 Spotlight: XAI for Transformers: Better Explanations through Conservative Propagation
  Ameen Ali · Thomas Schnake · Oliver Eberle · Grégoire Montavon · Klaus-Robert Müller · Lior Wolf
- 2022 Oral: Path-Gradient Estimators for Continuous Normalizing Flows
  Lorenz Vaitl · Kim A. Nicoli · Shinichi Nakajima · Pan Kessel
- 2021: [12:52 - 01:45 PM UTC] Invited Talk 2: Toward Explainable AI
  Klaus-Robert Müller · Wojciech Samek · Grégoire Montavon
- 2020 Workshop: XXAI: Extending Explainable AI Beyond Deep Models and Classifiers
  Wojciech Samek · Andreas Holzinger · Ruth Fong · Taesup Moon · Klaus-Robert Müller
- 2017 Poster: Minimizing Trust Leaks for Robust Sybil Detection
  János Höner · Shinichi Nakajima · Alexander Bauer · Klaus-Robert Müller · Nico Görnitz
- 2017 Talk: Minimizing Trust Leaks for Robust Sybil Detection
  János Höner · Shinichi Nakajima · Alexander Bauer · Klaus-Robert Müller · Nico Görnitz