How much Data is Augmentation Worth?
Jonas Geiping · Gowthami Somepalli · Ravid Shwartz-Ziv · Andrew Wilson · Tom Goldstein · Micah Goldblum
Event URL: https://openreview.net/forum?id=TMs-EYm4_ms

Despite the clear performance benefits of data augmentations, little is known about why they are so effective. In this paper, we disentangle several key mechanisms through which data augmentations operate. Establishing an "exchange rate" between augmented and additional real data, we find that augmentations can provide nearly the same performance gains as additional data samples for in-domain generalization and even greater performance gains for out-of-distribution test sets. We also find that neural networks with hard-coded invariances underperform those with invariances learned via data augmentations. Our experiments suggest that these benefits to generalization arise from the additional stochasticity conferred by randomized augmentations, leading to flatter minima.
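To make the "exchange rate" idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation): it assumes you already have an accuracy-vs-dataset-size curve for models trained without augmentation, and inverts that curve to ask how many extra real samples an augmented run's accuracy is "worth". All numbers and names below are illustrative placeholders.

    import numpy as np

    # Hypothetical accuracy-vs-dataset-size curve for models trained WITHOUT
    # augmentation (x: number of real training samples, y: test accuracy).
    real_sizes = np.array([5_000, 10_000, 20_000, 40_000])
    real_accs = np.array([0.72, 0.79, 0.85, 0.89])

    def effective_extra_samples(aug_acc, base_size, sizes=real_sizes, accs=real_accs):
        """Estimate how many additional real samples an augmented-training
        accuracy is 'worth' by inverting the no-augmentation scaling curve."""
        # Interpolate the inverse curve: accuracy -> equivalent dataset size.
        matched_size = np.interp(aug_acc, accs, sizes)
        return matched_size - base_size

    # Example: if augmentation on 10k samples reaches 0.84 accuracy, the curve
    # above says ~18.3k un-augmented samples would be needed to match it,
    # i.e. the augmentation is "worth" roughly 8.3k extra real samples.
    print(effective_extra_samples(aug_acc=0.84, base_size=10_000))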

Author Information

Jonas Geiping (University of Maryland, College Park)
Gowthami Somepalli (University of Maryland, College Park)
Ravid Shwartz-Ziv (New York University)
Andrew Wilson (New York University)
Tom Goldstein (University of Maryland)
Micah Goldblum (New York University)
