Particle Flow for Learning from Label Proportions
Abstract
This work proposes a novel method for learning from label proportions (LLP). We learn a classifier that minimizes three objectives: (i) a bag-level loss, which quantifies the discrepancy between the true and predicted label proportions in each bag; (ii) an instance-level loss, inspired by domain adaptation, which leverages anchor samples with known labels and trainable supports; and (iii) a distribution discrepancy that aligns the learned support of the anchors with that of the bag samples. The problem is formulated as an alternating optimization that iteratively updates the classifier and aligns the distributions via a particle flow method. The flow of anchor samples is governed by a vector field designed to minimize the anchor loss while ensuring alignment between the anchor and bag distributions. We provide a theoretical analysis that guarantees convergence of the flow and identifies conditions under which the method achieves effective alignment. Our analysis highlights that the gap and diversity in label proportions within bags are critical factors for learnability. Empirical results on tabular and image datasets show that the method outperforms state-of-the-art approaches.
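To make objective (i) concrete, the following is a minimal sketch of one common form of bag-level loss: the cross-entropy between a bag's known label proportions and the average of the classifier's predicted class probabilities over that bag. The abstract does not specify which discrepancy the method uses, so the cross-entropy choice, the function names (`predicted_proportions`, `bag_proportion_loss`), and the `eps` smoothing constant are all assumptions made for illustration only.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def predicted_proportions(logits):
    """Average the per-instance class probabilities over one bag.

    logits: array of shape (n_instances_in_bag, n_classes).
    Returns an array of shape (n_classes,) that sums to 1.
    """
    return softmax(logits).mean(axis=0)


def bag_proportion_loss(logits, true_proportions, eps=1e-12):
    """Cross-entropy between true and predicted bag proportions.

    This is an assumed, illustrative discrepancy; other choices
    (e.g. a squared or KL-based distance) would fit the same role.
    """
    pred = predicted_proportions(logits)
    return float(-np.sum(true_proportions * np.log(pred + eps)))


# A bag of four instances and two classes, with true proportions (0.5, 0.5).
true_pi = np.array([0.5, 0.5])
balanced_logits = np.array([[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]])
skewed_logits = np.array([[5.0, 0.0], [5.0, 0.0], [5.0, 0.0], [5.0, 0.0]])

loss_balanced = bag_proportion_loss(balanced_logits, true_pi)
loss_skewed = bag_proportion_loss(skewed_logits, true_pi)
```

In this toy bag, predictions whose average matches the true proportions yield a lower loss than predictions that assign every instance to the same class. Note that the bag-level loss alone cannot tell which instances belong to which class, which is the ambiguity the anchor-based objectives (ii) and (iii) are meant to address.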