Causes and Consequences of Representational Similarity in Machine Learning Models
Abstract
Numerous works have noted similarities in how machine learning models represent the world, even across modalities. Although much effort has been devoted to uncovering the properties and metrics on which these models align, surprisingly little work has explored the causes of this similarity. To advance this line of inquiry, this work explores how two factors, dataset overlap and task overlap, influence downstream model similarity. We evaluate the effects of both factors through experiments across model sizes and modalities, from small classifiers to large language models. We find that, in general, both task overlap and dataset overlap lead to higher representational similarity. Finally, we consider the downstream consequences of representational similarity, demonstrating that greater similarity increases vulnerability to transferable adversarial attacks.