Poster

Improved OOD Generalization via Adversarial Training and Pretraining

Mingyang Yi · Lu Hou · Jiacheng Sun · Lifeng Shang · Xin Jiang · Qun Liu · Zhiming Ma

Keywords: Deep Learning Theory

2021 Poster

Abstract

Recently, learning a model that generalizes well on out-of-distribution (OOD) data has attracted great attention in the machine learning community. In this paper, after defining OOD generalization via the Wasserstein distance, we theoretically show that a model robust to input perturbation also generalizes well on OOD data. Inspired by previous findings that adversarial training helps improve robustness, we show that models trained by adversarial training have a convergent excess risk on OOD data. Furthermore, in the pre-training then fine-tuning paradigm, we theoretically show that a pre-trained model that is robust to input perturbation provides an initialization that generalizes well on downstream OOD data. Finally, experiments on image classification and natural language understanding tasks verify our theoretical findings.
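
To make the idea of adversarial training mentioned in the abstract concrete, the following is a minimal sketch on a toy problem. It is not the authors' method or experimental setup. It assumes a linear logistic classifier, labels in {-1, +1}, and an l_inf-bounded input perturbation of radius `eps`, for which the worst-case perturbation has a closed form. All names (`worst_case_input`, `adversarial_train`, `robust_accuracy`, `TOY_DATA`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy dataset: (features, label) with labels in {-1, +1}.
TOY_DATA = [
    ([2.0, 1.5], 1), ([1.5, 2.5], 1), ([2.5, 2.0], 1),
    ([-2.0, -1.5], -1), ([-1.5, -2.5], -1), ([-2.5, -2.0], -1),
]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def worst_case_input(x, y, w, eps):
    """Loss-maximizing l_inf perturbation of x for a linear score w.x + b.

    For a linear model the inner maximization is exact: every coordinate
    moves by eps in the direction that lowers the margin y * (w.x + b).
    """
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]


def logistic_loss(x, y, w, b):
    """Numerically stable log(1 + exp(-margin))."""
    margin = y * (dot(w, x) + b)
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def adversarial_train(data, eps, lr=0.1, epochs=100, seed=0):
    """SGD on the logistic loss evaluated at worst-case inputs."""
    rng = random.Random(seed)
    samples = list(data)
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            # Inner step: perturb the input against the current model.
            x_adv = worst_case_input(x, y, w, eps)
            margin = y * (dot(w, x_adv) + b)
            # s = sigmoid(-margin), computed without overflow.
            if margin >= 0:
                e = math.exp(-margin)
                s = e / (1.0 + e)
            else:
                s = 1.0 / (1.0 + math.exp(margin))
            # Outer step: the loss gradient at x_adv is -s * y * x_adv
            # for w and -s * y for b, so descend along its negative.
            w = [wi + lr * s * y * xi for wi, xi in zip(w, x_adv)]
            b += lr * s * y
    return w, b


def robust_accuracy(data, w, b, eps):
    """Fraction of points still classified correctly under worst-case inputs."""
    hits = sum(
        1 for x, y in data
        if y * (dot(w, worst_case_input(x, y, w, eps)) + b) > 0
    )
    return hits / len(data)


if __name__ == "__main__":
    w, b = adversarial_train(TOY_DATA, eps=0.5)
    print("robust accuracy:", robust_accuracy(TOY_DATA, w, b, eps=0.5))
```

For deep networks the inner maximization has no closed form and is typically approximated with gradient-based attacks; the linear case is used here only because it keeps the example exact and dependency-free.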
