Self-supervised contrastive learning has recently been shown to be very effective at preventing deep networks from overfitting noisy labels. Despite this empirical success, theoretical understanding of how contrastive learning boosts the robustness of deep networks is very limited. In this work, we show that contrastive learning provably boosts the robustness of deep networks against noisy labels by providing an embedding matrix that has (i) a singular value corresponding to each subclass in the data that is considerably larger than the sum of the remaining singular values of that subclass; and (ii) a large alignment between the largest singular vector and the clean labels of that subclass. These properties allow a linear layer trained on the embeddings to learn the clean labels quickly, and prevent it from overfitting the noisy labels for a large number of training iterations. We further show that the initial robustness provided by contrastive learning enables state-of-the-art robust methods to achieve superior performance under extreme noise levels, e.g., a 6.3% increase in accuracy on CIFAR-10 with 40% asymmetric label noise, and a 14% increase in accuracy on CIFAR-100 with 80% symmetric label noise.
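To make the two spectral properties concrete, below is a minimal NumPy sketch that checks them on a toy subclass whose embeddings cluster around a shared direction. It is purely illustrative: the synthetic embedding construction, noise level, and all names are assumptions for this sketch, not the paper's setup.

import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings for a single subclass: n points stretched along a shared
# unit direction u (mimicking well-clustered contrastive representations),
# plus small isotropic noise.
n, d = 500, 128
u = rng.normal(size=d)
u /= np.linalg.norm(u)
scales = rng.uniform(0.8, 1.2, size=n)
Z = np.outer(scales, u) + 0.005 * rng.normal(size=(n, d))

# Property (i): the subclass's top singular value should dominate the sum
# of its remaining singular values.
s = np.linalg.svd(Z, compute_uv=False)
print(f"top singular value: {s[0]:.2f}, sum of the rest: {s[1:].sum():.2f}")

# Property (ii): the top left singular vector should have large alignment
# with the clean label indicator of the subclass (all n points here share
# one label, so the indicator is the normalized all-ones vector).
U, _, _ = np.linalg.svd(Z, full_matrices=False)
y = np.ones(n) / np.sqrt(n)
print(f"alignment |<u1, y>|: {abs(U[:, 0] @ y):.3f}")

With this construction, the top singular value exceeds the sum of the rest and the alignment is close to 1; increasing the noise scale degrades both quantities, which is the regime where a linear head can start to fit noisy labels.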
Author Information
Yihao Xue (UCLA)
Kyle Whitecross (UCLA)
UCLA undergraduate studying CS; ML intern at Kumo.ai.
Baharan Mirzasoleiman (UCLA)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Guaranteed Robust Deep Learning against Extreme Label Noise using Self-supervised Learning
More from the Same Authors
- 2021: CrossWalk: Fairness-enhanced Node Representation Learning
  Ahmad Khajehnejad · Moein Khajehnejad · Krishna Gummadi · Adrian Weller · Baharan Mirzasoleiman
- 2022 Poster: Adaptive Second Order Coresets for Data-efficient Machine Learning
  Omead Pooladzandi · David Davini · Baharan Mirzasoleiman
- 2022 Spotlight: Adaptive Second Order Coresets for Data-efficient Machine Learning
  Omead Pooladzandi · David Davini · Baharan Mirzasoleiman
- 2022 Poster: Not All Poisons are Created Equal: Robust Training against Data Poisoning
  Yu Yang · Tian Yu Liu · Baharan Mirzasoleiman
- 2022 Oral: Not All Poisons are Created Equal: Robust Training against Data Poisoning
  Yu Yang · Tian Yu Liu · Baharan Mirzasoleiman
- 2022: Investigating Why Contrastive Learning Benefits Robustness against Label Noise
  Yihao Xue · Kyle Whitecross · Baharan Mirzasoleiman
- 2022: Less Data Can Be More!
  Baharan Mirzasoleiman
- 2022: Not All Poisons are Created Equal: Robust Training against Data Poisoning
  Yu Yang · Baharan Mirzasoleiman
- 2021: Data-efficient and Robust Learning from Massive Datasets
  Baharan Mirzasoleiman