Test-of-Time Award

The Test-of-Time award celebrates the lasting impact a paper has had over the past ten years. This year, we considered all papers presented at ICML 2013 and, among them, selected three that have been widely cited and followed up on by the machine learning community since then. These three papers cover diverse aspects of machine learning, including unsupervised representation learning, hyperparameter tuning (model selection), and learning beyond average risk minimization. After a discussion among the program chairs, we are ready to announce this year's (2023) Test-of-Time award. The award goes to


  • Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, Cynthia Dwork. Learning Fair Representations. Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):325-333, 2013.

This paper played an influential role in creating what is now a well-established subfield of machine learning: ML and fairness. It introduced the broader ML community to various notions of fairness, including group and individual fairness. This area is becoming increasingly important as advanced ML systems, often referred to as generative AI, are deployed widely in society, impacting it in unforeseen and potentially biased or unfair ways. This paper shaped a core aspect of the research problems our community tackles, and its views on how to assess the effectiveness of the tools it develops.


In the rest of this announcement, we explain why we believe these three papers (the award paper and the two runners-up) deserve such recognition. There were two runners-up, both of which we believe have left a lasting impact in their own ways.


  • Galen Andrew, Raman Arora, Jeff Bilmes, Karen Livescu. Deep Canonical Correlation Analysis.  Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1247-1255, 2013.

This paper is significant in two ways. First, it proposed a principled approach to multimodal representation learning with deep neural networks. Such deep multimodal learning has become one of the highlights of recent advances in so-called generative artificial intelligence, exemplified by DALL·E, Stable Diffusion, and others. Second, it showed that it is possible to obtain good representations of data in an unsupervised way without relying on reconstruction of observations, but rather by relating multiple views of the same observation within the representation space. This has inspired numerous recent reconstruction-free self-supervised representation learning algorithms, such as contrastive learning.

  • James Bergstra, Daniel Yamins, David Cox. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):115-123, 2013.

Despite its enormous importance and impact on the practice of machine learning, hyperparameter tuning had often been treated as an afterthought, or as art rather than science. In this paper, Bergstra et al. demonstrated that advances in black-box optimization and our understanding of machine learning finally allow us to treat hyperparameter tuning as a rigorous scientific and engineering problem that can be approached systematically rather than heuristically. Their experiments on automatic tuning of hundreds of hyperparameters were among the most convincing demonstrations in favour of algorithmic hyperparameter tuning.