Morning Poster in Workshop: Artificial Intelligence & Human Computer Interaction
ConceptEvo: Interpreting Concept Evolution in Deep Learning Training
Haekyu Park · Seongmin Lee · Benjamin Hoover · Austin Wright · Omar Shaikh · Rahul Duggal · Nilaksh Das · Kevin Li · Judy Hoffman · Polo Chau
We present ConceptEvo, a unified interpretation framework for deep neural networks (DNNs) that reveals the inception and evolution of learned concepts during training. Our work fills a critical gap in DNN interpretation research, as existing methods focus on post-hoc interpretation of already-trained models. ConceptEvo makes two novel technical contributions: (1) an algorithm that generates a unified semantic space, enabling side-by-side comparison of different models during training; and (2) an algorithm that discovers and quantifies concept evolutions that are important for class predictions. Through a large-scale human evaluation with 260 participants and quantitative experiments, we show that ConceptEvo discovers evolutions across different models that are meaningful to humans and important for predictions. ConceptEvo works for both modern DNNs (ConvNeXt) and classic ones (e.g., VGGs, InceptionV3).