

Poster

Feature Contamination: On the Feasibility of Learning Representations that Generalize Out-of-Distribution

Tianren Zhang · Chujie Zhao · Guanyu Chen · Yizhou Jiang · Feng Chen


Abstract:

Learning representations that generalize out-of-distribution (OOD) is critical for developing robust machine learning models. However, despite significant efforts in recent years, algorithmic advances in this direction have been limited, creating a gap between theory and practice. In this work, we seek to understand the fundamental difficulty of OOD generalization in deep learning. We first empirically show that, perhaps surprisingly, even with complete prior knowledge of OOD-generalizable representations during training, the learned network still underperforms OOD across a wide range of benchmarks. To explain this, we then formally study two-layer ReLU networks trained by stochastic gradient descent in a structured OOD generalization setting, unveiling an unexplored failure mode that we refer to as feature contamination. We show that this failure mode essentially stems from the inductive biases of non-linear neural networks and fundamentally differs from the prevailing narrative of spurious correlations. Our results provide new insights into OOD generalization and neural networks, suggest that OOD generalization in practice can deviate from existing models and explanations, and demonstrate the necessity of incorporating inductive bias into OOD generalization algorithms.
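To make the setting named in the abstract concrete, below is a minimal illustrative sketch (not the authors' code or experiment) of a two-layer ReLU network trained with plain SGD on a synthetic task where one input coordinate is label-relevant and another is label-irrelevant but shifts in distribution at test time. The data-generating function, feature names, and hyperparameters are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic setup (illustrative only): inputs contain a "core" feature that is
# predictive in both distributions and an "env" feature whose distribution
# shifts at test time.
def make_data(n, env_shift):
    core = torch.randn(n, 1)                     # label-relevant feature
    env = torch.randn(n, 1) + env_shift          # label-irrelevant, shifts OOD
    x = torch.cat([core, env], dim=1)
    y = (core.squeeze(1) > 0).float()            # label depends only on core
    return x, y

x_train, y_train = make_data(2000, env_shift=0.0)
x_test, y_test = make_data(2000, env_shift=3.0)  # distribution shift on env

# Two-layer ReLU network trained with plain SGD, matching the architecture
# class the abstract refers to (width and learning rate are assumptions).
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x_train).squeeze(1), y_train)
    loss.backward()
    opt.step()

# Compare in-distribution vs. out-of-distribution accuracy.
with torch.no_grad():
    acc_id = ((model(x_train).squeeze(1) > 0).float() == y_train).float().mean()
    acc_ood = ((model(x_test).squeeze(1) > 0).float() == y_test).float().mean()
print(f"in-distribution accuracy: {acc_id:.3f}, OOD accuracy: {acc_ood:.3f}")
```

This sketch only fixes the training setup (two-layer ReLU network, SGD, covariate shift on a task-irrelevant feature); it does not reproduce the paper's structured setting or its feature-contamination analysis.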
