Does the dominant approach to learning representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are richer than those obtained with a single optimization episode. We support this thesis with simple theoretical arguments and with experiments using an apparently naive ensembling technique: concatenating the representations obtained from multiple training episodes that use the same data, model, algorithm, and hyper-parameters but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained in a single run. This demonstrates that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation, because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.
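The ensembling technique described above can be sketched as follows. This is a minimal illustration, not the paper's actual experimental code: each "training episode" is stood in for by a random projection followed by a ReLU (a hypothetical simplification of a trained network's penultimate-layer features), and only the random seed varies across episodes before the resulting representations are concatenated.

```python
import numpy as np

def train_feature_extractor(X, dim_out, seed):
    # Stand-in for one training episode: a seeded random projection
    # plus ReLU plays the role of a trained network's learned
    # feature map (hypothetical simplification).
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], dim_out)) / np.sqrt(X.shape[1])
    return lambda Z: np.maximum(Z @ W, 0.0)

# Toy data: 128 examples with 32 input features.
X = np.random.default_rng(0).standard_normal((128, 32))

# Same data, same "architecture", same hyper-parameters;
# only the seed differs across the K = 4 training episodes.
extractors = [train_feature_extractor(X, dim_out=16, seed=s) for s in range(4)]

# The ensembled representation is the concatenation of the K
# independently obtained representations.
phi_cat = np.concatenate([f(X) for f in extractors], axis=1)
print(phi_cat.shape)  # (128, 64): K * dim_out features per example
```

A downstream predictor (e.g. a linear probe) would then be trained on `phi_cat` rather than on any single episode's features; the point of the paper is that this concatenated representation transfers better when the task or distribution changes.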
Author Information
Jianyu Zhang (New York University)
Leon Bottou (Meta AI)