Skip to yearly menu bar Skip to main content


Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

Karsten Roth · Timo Milbich · Bjorn Ommer · Joseph Paul Cohen · Marzyeh Ghassemi

Keywords: [ Algorithms ] [ Metric Learning ] [ Fairness, Accountability, and Transparency ]


Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot retrieval applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives. However, generalization capacity is known to scale with the embedding space dimensionality. Unfortunately, high dimensional embeddings also create higher retrieval cost for downstream applications. To remedy this, we propose S2SD - Simultaneous Similarity-based Self-distillation. S2SD extends DML with knowledge distillation from auxiliary, high-dimensional embedding and feature spaces to leverage complementary context during training while retaining test-time cost and with negligible changes to the training time. Experiments and ablations across different objectives and standard benchmarks show S2SD offering highly significant improvements of up to 7% in Recall@1, while also setting a new state-of-the-art.

Chat is not available.