ICML Poster Does a Neural Network Really Encode Symbolic Concepts?

Mingjie Li · Quanshi Zhang

Exhibit Hall 1 #543

Abstract:

Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and to define such interactions as concepts encoded by the DNN. However, strictly speaking, there is still no solid guarantee that such interactions indeed represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which partially aligns with human intuition. The code is released at https://github.com/sjtu-xai-lab/interaction-concept.
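
The abstract does not spell out how an interaction between input variables is computed. As a minimal sketch, assuming the Harsanyi-dividend definition that is common in this line of work, the interaction of a variable subset S is I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) · v(T), where v(T) is the model output when only the variables in T are kept. The names `toy_model`, `subsets`, and `harsanyi_interaction` below are hypothetical and are not taken from the released repository; `toy_model` is a hand-written stand-in for a DNN.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a tuple, from the empty set to the full set."""
    items = tuple(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def harsanyi_interaction(v, S):
    """Assumed definition: I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(T)."""
    S = tuple(S)
    return sum((-1) ** (len(S) - len(T)) * v(frozenset(T)) for T in subsets(S))


def toy_model(present):
    """Hypothetical stand-in for a DNN output on a masked input.

    `present` is the set of unmasked variable indices. The output is
    2 * x0 + 3 * (x1 AND x2), so only two interactions should be non-zero.
    """
    return 2.0 * (0 in present) + 3.0 * (1 in present and 2 in present)


# Enumerate all 2^3 subsets of the three input variables.
interactions = {S: harsanyi_interaction(toy_model, S) for S in subsets(range(3))}

# Keep only the salient (non-zero) interactions.
sparse = {S: val for S, val in interactions.items() if abs(val) > 1e-9}

for S, val in sparse.items():
    print(S, val)
```

In this toy setting only the subsets (0,) and (1, 2) carry non-zero interaction, which illustrates what "sparse" means here: a few subsets account for the whole output. The exhaustive enumeration costs 2^n model evaluations, so it is practical only for a small number of input variables.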
