Poster
Locally Adaptive Label Smoothing Improves Predictive Churn
Dara Bahri · Heinrich Jiang
Training modern neural networks is an inherently noisy process that can lead to high "prediction churn" (disagreements between re-trainings of the same model due to factors such as randomization in parameter initialization and mini-batch ordering), even when the trained models all attain similar accuracy. Such prediction churn can be very undesirable in practice. In this paper, we present several baselines for reducing churn and show that training on soft labels, obtained by adaptively smoothing each example's label based on the labels of its neighboring examples, often outperforms these baselines on churn while also improving accuracy on a variety of benchmark classification tasks and model architectures.
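To make the idea concrete, below is a minimal sketch of the two notions the abstract describes: churn as the disagreement rate between two re-trainings, and soft labels built from each example's neighbors. This is an illustration under stated assumptions, not the paper's exact formulation; the k-NN feature space, the neighbor count k, and the mixing weight alpha are all illustrative choices.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def churn(preds_a, preds_b):
    # Prediction churn: the fraction of examples on which two
    # independently trained copies of the same model disagree.
    return float(np.mean(preds_a != preds_b))

def locally_adaptive_soft_labels(features, labels, num_classes, k=10, alpha=0.1):
    # Mix each one-hot label with the empirical label distribution of the
    # example's k nearest neighbors: examples whose neighborhoods agree with
    # their label keep a sharp target, while examples in mixed neighborhoods
    # receive softer targets.
    one_hot = np.eye(num_classes)[labels]              # (n, num_classes)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)                   # (n, k + 1); column 0 is the point itself
    neighbor_labels = labels[idx[:, 1:]]               # (n, k), self excluded
    neighbor_dist = np.stack([
        np.bincount(row, minlength=num_classes) / k    # label histogram per neighborhood
        for row in neighbor_labels
    ])
    return (1.0 - alpha) * one_hot + alpha * neighbor_dist
```

In training, such soft labels would replace the one-hot targets in a standard cross-entropy loss; churn would then be measured by comparing the argmax predictions of two independently trained runs on held-out data.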
Author Information
Dara Bahri (Google Research)
Heinrich Jiang (Google Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Locally Adaptive Label Smoothing Improves Predictive Churn
  Fri. Jul 23rd 03:50 -- 03:55 AM
More from the Same Authors
- 2023: Sharpness-Aware Minimization Leads to Low-Rank Features
  Maksym Andriushchenko · Dara Bahri · Hossein Mobahi · Nicolas Flammarion
- 2022: Confident Adaptive Language Modeling
  Tal Schuster · Adam Fisch · Jai Gupta · Mostafa Dehghani · Dara Bahri · Vinh Tran · Yi Tay · Don Metzler
- 2021 Poster: Active Covering
  Heinrich Jiang · Afshin Rostamizadeh
- 2021 Spotlight: Active Covering
  Heinrich Jiang · Afshin Rostamizadeh
- 2021 Poster: OmniNet: Omnidirectional Representations from Transformers
  Yi Tay · Mostafa Dehghani · Vamsi Aribandi · Jai Gupta · Philip Pham · Zhen Qin · Dara Bahri · Da-Cheng Juan · Don Metzler
- 2021 Poster: Synthesizer: Rethinking Self-Attention for Transformer Models
  Yi Tay · Dara Bahri · Don Metzler · Da-Cheng Juan · Zhe Zhao · Che Zheng
- 2021 Spotlight: Synthesizer: Rethinking Self-Attention for Transformer Models
  Yi Tay · Dara Bahri · Don Metzler · Da-Cheng Juan · Zhe Zhao · Che Zheng
- 2021 Oral: OmniNet: Omnidirectional Representations from Transformers
  Yi Tay · Mostafa Dehghani · Vamsi Aribandi · Jai Gupta · Philip Pham · Zhen Qin · Dara Bahri · Da-Cheng Juan · Don Metzler
- 2020 Poster: Sparse Sinkhorn Attention
  Yi Tay · Dara Bahri · Liu Yang · Don Metzler · Da-Cheng Juan
- 2020 Poster: Deep k-NN for Noisy Labels
  Dara Bahri · Heinrich Jiang · Maya Gupta
- 2019 Poster: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
  Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You
- 2019 Poster: Shape Constraints for Set Functions
  Andrew Cotter · Maya Gupta · Heinrich Jiang · Erez Louidor · James Muller · Taman Narayan · Serena Wang · Tao Zhu
- 2019 Oral: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
  Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You
- 2019 Oral: Shape Constraints for Set Functions
  Andrew Cotter · Maya Gupta · Heinrich Jiang · Erez Louidor · James Muller · Taman Narayan · Serena Wang · Tao Zhu