Differentiable Top-k Classification Learning
Felix Petersen · Hilde Kuehne · Christian Borgelt · Oliver Deussen

Wed Jul 20 03:30 PM -- 05:30 PM (PDT) @ Hall E #328

The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a family of differentiable top-k cross-entropy classification losses. This allows training while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed losses for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.
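To make the idea concrete, here is a minimal numpy sketch of a differentiable top-k cross-entropy averaged over several k simultaneously. It uses the NeuralSort relaxation (Grover et al.) as the differentiable sorting component; the paper itself builds on differentiable sorting networks and related operators, so this is an illustrative stand-in, not the authors' implementation. The function names, the clipping, and the choice of temperature `tau` are assumptions for this sketch.

```python
import numpy as np

def neuralsort_perm(scores, tau=1.0):
    # NeuralSort relaxation: returns a row-stochastic soft permutation
    # matrix P of shape (n, n), where row i softly selects the i-th
    # largest score. Lower tau -> closer to a hard permutation.
    s = scores.reshape(-1, 1)                      # (n, 1)
    n = s.shape[0]
    A = np.abs(s - s.T)                            # pairwise |s_i - s_j|
    B = (A @ np.ones((n, 1))).reshape(-1)          # row sums of A, shape (n,)
    ranks = n + 1 - 2 * np.arange(1, n + 1)        # (n,), one entry per rank
    logits = (np.outer(ranks, scores) - B) / tau   # logits[i, j] for rank i
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)        # row-wise softmax

def soft_topk_xent(scores, true_class, ks=(1, 5), tau=1.0):
    # Differentiable surrogate of -log P(true class among top-k),
    # averaged over multiple k at once (the relaxation of a fixed k).
    P = neuralsort_perm(scores, tau)
    loss = 0.0
    for k in ks:
        # Soft membership of the true class in the top-k rows.
        p_in_topk = P[:k, true_class].sum()
        # Rows are each stochastic but columns are not, so the sum can
        # slightly exceed 1 in this rough sketch; clip before the log.
        loss += -np.log(np.clip(p_in_topk, 1e-12, 1.0))
    return loss / len(ks)
```

With a low temperature and a clearly largest true-class score, `soft_topk_xent` approaches zero, matching the intuition that the loss rewards placing the correct class anywhere within the considered top-k sets rather than only at rank 1.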

Author Information

Felix Petersen (University of Konstanz)

Felix Petersen is a researcher and Ph.D. student in the Department of Computer Science at the University of Konstanz. His main research interests are investigating neural networks and combining them with differentiable algorithms, e.g., for solving unsupervised inverse problems. By the age of 19, Felix was the youngest student to obtain the degree of Bachelor of Computer Science, which he completed one year after starting his Ph.D. research. In 2019, he was awarded the Konrad Zuse Youth Prize for extraordinary work in the domain of artificial intelligence. In the past, Felix has worked at, among others, TAU, DESY, PSI, and CERN.

Hilde Kuehne (University of Frankfurt)
Christian Borgelt (University of Salzburg)
Oliver Deussen (University of Konstanz)
