

Contributed Talk in Workshop: 2nd Workshop on Formal Verification of Machine Learning

Outstanding Paper: Interpreting Robustness Proofs of Deep Neural Networks - Debangshu Banerjee, Avaljot Singh, Gagandeep Singh

Debangshu Banerjee


Abstract:

Numerous methods have emerged to verify the robustness of deep neural networks (DNNs). While effective at providing theoretical guarantees, the proofs these techniques generate are rarely human-interpretable. Our paper bridges this gap by introducing new concepts, algorithms, and representations that produce human-understandable interpretations of the proofs. Using our approach, we find that proofs for standardly trained DNNs rely more on spurious input features than those for provably robust DNNs. Provably robust DNNs filter out spurious input features, but this sometimes comes at the cost of semantically meaningful ones. DNNs that combine adversarial and provably robust training strike a balance between the two. Overall, our work improves human comprehension of robustness proofs and sheds light on their reliance on different types of input features.
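As background for the kind of proof the paper interprets, here is a minimal sketch of local robustness certification via interval bound propagation, a common sound-but-incomplete verification technique. This is a generic illustration, not the paper's own algorithm; the network weights and function names are made up for the example.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an input box [lo, hi] through the affine map x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def certify_linf_robustness(weights, biases, x, eps, label):
    """Return True if the proof succeeds: every input within an L-inf ball of
    radius eps around x provably keeps `label` as the top class.
    Sound but incomplete -- a False result does not imply an adversarial example."""
    lo, hi = x - eps, x + eps
    for W, b in zip(weights[:-1], biases[:-1]):
        lo, hi = interval_affine(lo, hi, W, b)
        lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)  # ReLU is monotone
    lo, hi = interval_affine(lo, hi, weights[-1], biases[-1])
    # Robust if the label's logit lower bound beats every other logit's upper bound.
    return all(lo[label] > hi[j] for j in range(len(lo)) if j != label)

# Hypothetical 2-layer, 2-class network for illustration.
W1, b1 = np.eye(2), np.zeros(2)
W2, b2 = np.eye(2), np.zeros(2)
x = np.array([2.0, 0.0])
print(certify_linf_robustness([W1, W2], [b1, b2], x, eps=0.5, label=0))  # True
print(certify_linf_robustness([W1, W2], [b1, b2], x, eps=1.5, label=0))  # False
```

The per-neuron lower and upper bounds computed along the way are exactly the kind of proof artifact whose reliance on individual input features the paper's interpretation method examines.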
