
Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters
Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia · Jaidev Gill · Christos Thrampoulidis

We are motivated by the question of what differences in the learning process occur when optimizing the supervised contrastive loss (SCL) versus the cross-entropy (CE) loss. Our main finding is that the geometry of feature embeddings learned by SCL forms an orthogonal frame (OF) regardless of the number of training examples per class. This is in contrast to the CE loss, for which previous work has shown that it learns embedding geometries that are highly dependent on the class sizes. We arrive at our finding theoretically, by proving that the global minimizers of an unconstrained features model with SCL loss and entry-wise non-negativity constraints form an OF. We then validate the model's prediction by conducting experiments with standard deep-learning models on benchmark vision datasets. Finally, our analysis and experiments reveal that the batching scheme chosen during SCL training can play a critical role in speeding up convergence to the OF geometry.
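For readers unfamiliar with the loss in question, the following is a minimal NumPy sketch of a batch supervised contrastive loss in its standard form (not the authors' code; the temperature value and the toy data are illustrative assumptions). It also shows, numerically, that embeddings arranged as an orthogonal frame — identical within a class, orthogonal across classes — achieve a near-zero SCL value, consistent with the OF geometry described in the abstract.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Batch supervised contrastive loss.

    features: (n, d) array of L2-normalized embeddings.
    labels:   (n,) integer class labels.
    For each anchor i, averages -log softmax similarity over its
    same-class positives, with the softmax taken over all a != i.
    """
    n = features.shape[0]
    sims = features @ features.T / temperature   # pairwise scaled similarities
    np.fill_diagonal(sims, -np.inf)              # exclude self-pairs from denominator
    log_denom = np.logaddexp.reduce(sims, axis=1)  # log sum_{a != i} exp(sim)

    total, anchors = 0.0, 0
    for i in range(n):
        positives = np.where((labels == labels[i]) & (np.arange(n) != i))[0]
        if len(positives) == 0:
            continue  # anchors with no positives contribute nothing
        # per-anchor loss: mean over positives of -(sim - log_denom)
        total += np.mean(log_denom[i] - sims[i, positives])
        anchors += 1
    return total / anchors

# Toy batch: 4 classes, 2 samples each. An orthogonal-frame arrangement
# (one-hot embeddings shared within each class) drives the loss near zero.
labels = np.repeat(np.arange(4), 2)
of_features = np.repeat(np.eye(4), 2, axis=0)
print(supervised_contrastive_loss(of_features, labels))  # close to 0
```

A random normalized batch with the same labels yields a much larger loss value, so the OF configuration is (locally, in this toy example) the favorable one.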

Author Information

Ganesh Ramachandra Kini (University of California, Santa Barbara)
Vala Vakilian (University of British Columbia)
Tina Behnia (University of British Columbia)
Jaidev Gill (University of British Columbia)
Christos Thrampoulidis (University of British Columbia)
