Oral in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability
ModelDiff: A Framework for Comparing Learning Algorithms
Harshay Shah · Sung Min (Sam) Park · Andrew Ilyas · Aleksander Madry
Abstract:
We study the problem of (learning) algorithm comparison, where the goal is to find differences between models trained with two different learning algorithms. We begin by formalizing this goal as one of finding distinguishing feature transformations, i.e., input transformations that change the predictions of models trained with one learning algorithm but not the other. We then present ModelDiff, a method that leverages the datamodels framework (Ilyas et al., 2022) to compare learning algorithms based on how they use their training data. Finally, we use ModelDiff to demonstrate how training image classifiers with standard data augmentation can amplify reliance on specific instances of co-occurrence and texture biases.
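The notion of a distinguishing feature transformation can be illustrated with a small toy sketch. This is not the ModelDiff method and does not use datamodels. It only checks the definition stated in the abstract: a transformation that changes the predictions of models from one learning algorithm but not the other. The function names, the tolerance `tol`, and the toy models and inputs are all hypothetical and chosen for illustration.

```python
from statistics import mean


def mean_prediction_shift(models, transform, inputs):
    """Average absolute change in model output when `transform` is applied.

    `models` is a list of callables standing in for models trained with
    one learning algorithm (an assumption made for this toy example).
    """
    return mean(
        abs(model(transform(x)) - model(x))
        for model in models
        for x in inputs
    )


def is_distinguishing(models_a, models_b, transform, inputs, tol=1e-6):
    """True if `transform` shifts predictions for exactly one model group."""
    shift_a = mean_prediction_shift(models_a, transform, inputs)
    shift_b = mean_prediction_shift(models_b, transform, inputs)
    return (shift_a > tol) != (shift_b > tol)


# Hypothetical models: group A relies on both features, group B only on
# the first feature.
models_a = [lambda x: x[0] + 0.5 * x[1], lambda x: x[0] + 0.8 * x[1]]
models_b = [lambda x: x[0], lambda x: 1.1 * x[0]]

inputs = [(1.0, 2.0), (0.5, -1.0), (-2.0, 3.0)]


def remove_second_feature(x):
    """Toy input transformation: zero out the second feature."""
    return (x[0], 0.0)


shift_a = mean_prediction_shift(models_a, remove_second_feature, inputs)
shift_b = mean_prediction_shift(models_b, remove_second_feature, inputs)
distinguishing = is_distinguishing(
    models_a, models_b, remove_second_feature, inputs
)
```

In this toy setting, removing the second feature changes the outputs of group A but leaves group B untouched, so the transformation counts as distinguishing under the definition above.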