

Poster in Workshop: Spurious correlations, Invariance, and Stability (SCIS)

How much Data is Augmentation Worth?

Jonas Geiping · Gowthami Somepalli · Ravid Shwartz-Ziv · Andrew Wilson · Tom Goldstein · Micah Goldblum

Keywords: [ Data Augmentations ] [ invariance ] [ out-of-domain ] [ Stochasticity ] [ Flatness ] [ Neural Networks ]


Abstract:

Despite the clear performance benefits of data augmentations, little is known about why they are so effective. In this paper, we disentangle several key mechanisms through which data augmentations operate. Establishing an "exchange rate" between augmented and additional real data, we find that augmentations can provide nearly the same performance gains as additional data samples for in-domain generalization and even greater performance gains for out-of-distribution test sets. We also find that neural networks with hard-coded invariances underperform those with invariances learned via data augmentations. Our experiments suggest that these benefits to generalization arise from the additional stochasticity conferred by randomized augmentations, leading to flatter minima.
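
One way to make the exchange-rate idea concrete is sketched below. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the rate is computed, so the sketch assumes it can be read off by asking how many real samples an unaugmented baseline would need to match the accuracy of an augmented model. The function name `effective_samples`, the log-linear interpolation, and all numbers are hypothetical and are not results from the paper.

```python
import math


def effective_samples(baseline_curve, target_acc):
    """Estimate how many real samples the unaugmented baseline needs
    to reach target_acc.

    baseline_curve: list of (num_samples, accuracy) pairs, assumed to be
    strictly increasing in accuracy as num_samples grows.
    Interpolation is linear in log(num_samples), an assumption made
    here purely for illustration.
    """
    pts = sorted(baseline_curve)
    for (n0, a0), (n1, a1) in zip(pts, pts[1:]):
        if a0 < a1 and a0 <= target_acc <= a1:
            t = (target_acc - a0) / (a1 - a0)
            log_n = math.log(n0) + t * (math.log(n1) - math.log(n0))
            return math.exp(log_n)
    raise ValueError("target accuracy outside the measured baseline range")


# Hypothetical measurements (NOT from the paper): baseline accuracy
# without augmentation at three training-set sizes.
baseline = [(1000, 0.50), (4000, 0.60), (16000, 0.70)]

# Hypothetical augmented run: 1000 real samples, accuracy 0.60.
n_aug, acc_aug = 1000, 0.60

n_eff = effective_samples(baseline, acc_aug)
extra_samples = n_eff - n_aug  # real samples the augmentation "is worth"
print(f"effective samples: {n_eff:.0f}, worth about {extra_samples:.0f} extra")
```

With these made-up numbers, the augmented model trained on 1000 samples matches a baseline trained on roughly 4000, so under this reading the augmentation is worth about 3000 additional real samples. The same comparison could be run against an out-of-distribution test set to contrast the two settings the abstract mentions.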
