Poster
Robustness in Multimodal Learning under Train-Test Modality Mismatch
Brandon McKinzie · Vaishaal Shankar · Joseph Cheng · Yinfei Yang · Jonathon Shlens · Alexander Toshev
Multimodal learning is defined as learning over multiple heterogeneous input modalities such as video, audio, and text. In this work, we study how models behave when the types of modalities differ between training and deployment, a situation that naturally arises in many applications of multimodal learning to hardware platforms. We present a multimodal robustness framework to provide a systematic analysis of common multimodal representation learning methods. Further, we identify robustness shortcomings of these approaches and propose two intervention techniques leading to $1.5\times$-$4\times$ robustness improvements on three datasets: AudioSet, Kinetics-400, and ImageNet-Captions. Finally, we demonstrate that these interventions better utilize additional modalities, if present, to achieve competitive results of $44.2$ mAP on AudioSet 20K.
Author Information
Brandon McKinzie (Apple)
Vaishaal Shankar (Amazon)
Joseph Cheng (Humane)
Yinfei Yang (Apple)
Jonathon Shlens (Google)
Alexander Toshev (Apple ML Research)
More from the Same Authors
- 2022 Poster: Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
  Alex Fang · Gabriel Ilharco · Mitchell Wortsman · Yuhao Wan · Vaishaal Shankar · Achal Dave · Ludwig Schmidt
- 2022 Spotlight: Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
  Alex Fang · Gabriel Ilharco · Mitchell Wortsman · Yuhao Wan · Vaishaal Shankar · Achal Dave · Ludwig Schmidt
- 2021 Poster: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
  John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt
- 2021 Spotlight: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
  John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt
- 2020 Poster: Neural Kernels Without Tangents
  Vaishaal Shankar · Alex Fang · Wenshuo Guo · Sara Fridovich-Keil · Jonathan Ragan-Kelley · Ludwig Schmidt · Benjamin Recht
- 2020 Poster: Evaluating Machine Accuracy on ImageNet
  Vaishaal Shankar · Rebecca Roelofs · Horia Mania · Alex Fang · Benjamin Recht · Ludwig Schmidt
- 2019 Poster: Do ImageNet Classifiers Generalize to ImageNet?
  Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar
- 2019 Oral: Do ImageNet Classifiers Generalize to ImageNet?
  Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar