The goal in label-imbalanced and group-sensitive classification is to optimize metrics such as balanced error and equal opportunity. Classical methods, such as re-weighted cross-entropy, are known to fail when used with the modern practice of training deep nets to the terminal phase of training (TPT), that is, training beyond zero training error. In contrast to previous heuristics, we follow a principled analysis that explains how different loss adjustments affect margins. First, we prove that for linear classifiers trained in the TPT, it is necessary to introduce multiplicative, rather than additive, logit adjustments so that the relative margins between classes change appropriately. To show this, we establish a connection between the multiplicative CE modification and cost-sensitive support-vector machines. While additive adjustments are ineffective deep in the TPT, we show numerically that they can speed up convergence by countering an initial negative effect of the multiplicative weights. Motivated by these findings, we formulate the vector-scaling (VS) loss, which captures existing techniques as special cases. For Gaussian-mixture data, we perform a generalization analysis, revealing tradeoffs between balanced/standard error and equal opportunity.
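For concreteness, below is a minimal sketch of a loss in this family in Python/PyTorch. It applies per-class multiplicative factors and additive offsets to the logits before the cross-entropy, as the abstract describes; the function name vs_loss, the count-based parameterizations of the two adjustments, and the default values of gamma and tau are illustrative assumptions, not the paper's exact prescription.

```python
import torch
import torch.nn.functional as F

def vs_loss(logits, targets, class_counts, gamma=0.3, tau=1.0):
    """Sketch of a vector-scaling (VS) style loss.

    Combines a multiplicative logit adjustment (which governs the
    relative margins reached in the terminal phase of training) with
    an additive one (which mainly affects early-training dynamics).
    The count-based choices below are assumptions for illustration.
    """
    counts = class_counts.float()
    priors = counts / counts.sum()
    delta = (counts / counts.max()) ** gamma  # multiplicative adjustment per class
    iota = tau * torch.log(priors)            # additive adjustment per class
    adjusted = logits * delta + iota          # broadcast over the batch dimension
    return F.cross_entropy(adjusted, targets)

# Example usage on a 3-class problem with imbalanced class counts.
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = vs_loss(logits, targets, class_counts=torch.tensor([500, 50, 5]))
```

Setting gamma = 0 recovers a purely additive (logit-adjusted) loss, and setting tau = 0 a purely multiplicative one, which is how existing techniques arise as special cases.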
Author Information
Ganesh Ramachandra Kini (University of California, Santa Barbara)
Orestis Paraskevas (University of California, Santa Barbara)
Samet Oymak (University of California, Riverside)
Christos Thrampoulidis (University of British Columbia)
More from the Same Authors
- 2021: Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
  Ke Wang · Vidya Muthukumar · Christos Thrampoulidis
- 2021: Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization
  Ke Wang · Christos Thrampoulidis
- 2021: Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds
  Yahya Sattar · Zhe Du · Davoud Ataee Tarzanagh · Necmiye Ozay · Laura Balzano · Samet Oymak
- 2021: Non-Stationary Representation Learning in Sequential Multi-Armed Bandits
  Qin Yuzhen · Tommaso Menara · Samet Oymak · ShiNung Ching · Fabio Pasqualetti
- 2023: Generalization and Stability of Interpolating Neural Networks with Minimal Width
  Hossein Taheri · Christos Thrampoulidis
- 2023: Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters
  Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia · Jaidev Gill · Christos Thrampoulidis
- 2023: Fast Test Error Rates for Gradient-based Algorithms on Separable Data
  Puneesh Deora · Bhavya Vasudeva · Vatsal Sharan · Christos Thrampoulidis
- 2023: On the Training and Generalization Dynamics of Multi-head Attention
  Puneesh Deora · Rouzbeh Ghaderi · Hossein Taheri · Christos Thrampoulidis
- 2023 Poster: On the Role of Attention in Prompt-tuning
  Samet Oymak · Ankit Singh Rawat · Mahdi Soltanolkotabi · Christos Thrampoulidis
- 2022 Poster: FedNest: Federated Bilevel, Minimax, and Compositional Optimization
  Davoud Ataee Tarzanagh · Mingchen Li · Christos Thrampoulidis · Samet Oymak
- 2022 Oral: FedNest: Federated Bilevel, Minimax, and Compositional Optimization
  Davoud Ataee Tarzanagh · Mingchen Li · Christos Thrampoulidis · Samet Oymak
- 2021 Poster: Safe Reinforcement Learning with Linear Function Approximation
  Sanae Amani · Christos Thrampoulidis · Lin Yang
- 2021 Spotlight: Safe Reinforcement Learning with Linear Function Approximation
  Sanae Amani · Christos Thrampoulidis · Lin Yang