ICML Poster DNA: Domain Generalization with Diversified Neural Averaging

Xu Chu · Yujie Jin · Wenwu Zhu · Yasha Wang · Xin Wang · Shanghang Zhang · Hong Mei

Hall E #100

Keywords: [ DL: Algorithms ] [ MISC: Transfer, Multitask and Meta-learning ] [ APP: Computer Vision ]

Abstract: The inaccessibility of target domain data makes domain generalization (DG) methods prone to forgetting target-discriminative features, and challenges the pervasive theme in the existing literature of pursuing a single classifier with an ideal joint risk. In contrast, this paper investigates model misspecification and attempts to bridge DG with classifier ensembles, both theoretically and methodologically. By introducing a pruned Jensen-Shannon (PJS) loss, we show that the target square-root risk w.r.t. the PJS loss of the $\rho$-ensemble (the averaged classifier weighted by a quasi-posterior $\rho$) is bounded by the averaged source square-root risk of the Gibbs classifiers. We derive a tighter bound by enforcing a positive principled diversity measure of the classifiers. We give a PAC-Bayes upper bound on the target square-root risk of the $\rho$-ensemble. Methodologically, we propose a diversified neural averaging (DNA) method for DG, which approximately optimizes the proposed PAC-Bayes bound. The DNA method samples Gibbs classifiers transversely and longitudinally by simultaneously considering the dropout variational family and the optimization trajectory. The $\rho$-ensemble is approximated by averaging the longitudinal weights in a single run with dropout shut down, yielding a fast ensemble with low computational overhead. Empirically, the proposed DNA method achieves state-of-the-art classification performance on standard DG benchmark datasets.
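The sampling-and-averaging idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a two-parameter linear model, a squared loss in place of the PJS loss, plain SGD, and a uniform running average of the iterates. The names `dropout_mask`, `train_with_weight_averaging`, and the `burn_in` parameter are hypothetical and chosen only for illustration.

```python
import random


def dropout_mask(n, p, rng):
    """Inverted-dropout mask: drop each unit with probability p, scale survivors by 1/(1-p)."""
    keep = 1.0 - p
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(n)]


def predict(w, x, mask=None):
    """Linear score. mask=None means dropout is shut down (the evaluation setting)."""
    if mask is None:
        mask = [1.0] * len(w)
    return sum(wi * mi * xi for wi, mi, xi in zip(w, mask, x))


def train_with_weight_averaging(data, dim, steps=2000, lr=0.05, p=0.2, burn_in=500, seed=0):
    """SGD with dropout, keeping a running average of the weights along the trajectory.

    Assumption: uniform averaging after a hypothetical burn-in period; the paper's
    actual weighting and schedule are not specified in the abstract.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    w_avg = [0.0] * dim
    n_avg = 0
    for t in range(steps):
        x, y = data[rng.randrange(len(data))]
        # "Transverse" sample: a fresh dropout mask picks one member of the dropout family.
        mask = dropout_mask(dim, p, rng)
        err = predict(w, x, mask) - y
        # Gradient of 0.5 * err**2 with respect to w_i is err * mask_i * x_i.
        w = [wi - lr * err * mi * xi for wi, mi, xi in zip(w, mask, x)]
        # "Longitudinal" sample: each later iterate contributes to the running average.
        if t >= burn_in:
            n_avg += 1
            w_avg = [a + (wi - a) / n_avg for a, wi in zip(w_avg, w)]
    return w, w_avg, n_avg


# Toy data (an assumption for illustration): y = 2*x0 - 1*x1 with inputs in [-1, 1].
data_rng = random.Random(1)
data = []
for _ in range(200):
    x = [data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)]
    data.append((x, 2.0 * x[0] - 1.0 * x[1]))

w_last, w_avg, n_avg = train_with_weight_averaging(data, dim=2)

# Evaluation uses the averaged weights in one forward pass with dropout shut down.
score = predict(w_avg, [1.0, 0.0])
```

The averaged weights are evaluated once with dropout off, so inference costs the same as for a single model, which is the low-overhead property the abstract highlights. In this toy setting dropout on the inputs also shrinks the learned weights toward zero, so the average is not expected to match the generating coefficients exactly.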
