CSOR: Coreset Selection for Object Re-identification via Class Pruning
Abstract
Coreset Selection (CS) aims to extract a small yet representative subset from a large dataset, reducing the cost of model training. Although CS has been investigated primarily for classification tasks, it remains underexplored for object Re-identification (ReID). In this paper, we first formulate Coreset Selection for Object Re-identification (CSOR) as a joint optimization problem that seeks both the optimal coreset and the optimal class subset. We identify intra-class diversity as a key factor for effective coreset construction in ReID. Based on this insight, we propose a novel two-stage framework, consisting of Diversity-driven Class Pruning (DCP) and Coverage-Prioritized Sampling (CPS), to address the unique challenges of ReID datasets. First, classes with low feature diversity are pruned so that the storage budget is allocated to the remaining, more informative classes. Then, samples are greedily selected in an easy-to-hard class order to maximize feature coverage within each class. Extensive experiments on three person ReID datasets and one vehicle ReID dataset demonstrate that our method consistently outperforms existing CS approaches, establishing a new state of the art in CSOR.
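To make the two-stage idea concrete, the following is a minimal Python sketch of a prune-then-sample pipeline. It is an illustration under stated assumptions, not the method as implemented in the paper: the diversity score (mean distance to the class centroid), the farthest-point greedy rule used as a coverage proxy, the use of ascending diversity as the "easy-to-hard" class order, the even budget split, and all names (`class_diversity`, `greedy_coverage`, `select_coreset`, `keep_ratio`) are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def centroid(feats):
    """Component-wise mean of a list of equal-length feature vectors."""
    n, dim = len(feats), len(feats[0])
    return [sum(f[d] for f in feats) / n for d in range(dim)]


def class_diversity(feats):
    """ASSUMED diversity proxy: mean Euclidean distance to the class centroid."""
    c = centroid(feats)
    return sum(math.dist(f, c) for f in feats) / len(feats)


def greedy_coverage(feats, k):
    """ASSUMED coverage proxy: farthest-point (k-center) greedy selection.

    Returns indices local to `feats`, starting from the sample nearest the centroid.
    """
    k = min(k, len(feats))
    if k <= 0:
        return []
    c = centroid(feats)
    first = min(range(len(feats)), key=lambda i: math.dist(feats[i], c))
    selected = [first]
    # Distance from every sample to its nearest already-selected sample.
    nearest = [math.dist(f, feats[first]) for f in feats]
    while len(selected) < k:
        chosen = set(selected)
        nxt = max((i for i in range(len(feats)) if i not in chosen),
                  key=lambda i: nearest[i])
        selected.append(nxt)
        nearest = [min(d, math.dist(f, feats[nxt])) for d, f in zip(nearest, feats)]
    return selected


def select_coreset(features, labels, budget, keep_ratio=0.5):
    """Return sorted dataset indices of a coreset of at most `budget` samples."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    diversity = {c: class_diversity([features[i] for i in idxs])
                 for c, idxs in by_class.items()}

    # Stage 1 (class pruning): keep only the most diverse classes.
    n_keep = max(1, math.ceil(keep_ratio * len(by_class)))
    kept = sorted(by_class, key=diversity.get, reverse=True)[:n_keep]

    # Stage 2 (sampling): visit kept classes from low to high diversity
    # (a stand-in for "easy-to-hard") and split the remaining budget evenly,
    # so quota a small class cannot use rolls over to later classes.
    order = sorted(kept, key=diversity.get)
    coreset, remaining = [], budget
    for pos, c in enumerate(order):
        quota = remaining // (len(order) - pos)
        idxs = by_class[c]
        local = greedy_coverage([features[i] for i in idxs], quota)
        coreset.extend(idxs[j] for j in local)
        remaining -= len(local)
    return sorted(coreset)
```

In practice the feature vectors would come from a ReID embedding model, and the class labels would be identity labels; the pure-Python distance computations here are chosen for readability and would likely be replaced by vectorized operations on real datasets.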