
Spotlight
Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification
Nan Lu · Shida Lei · Gang Niu · Issei Sato · Masashi Sugiyama

Thu Jul 22 06:25 AM -- 06:30 AM (PDT)
To cope with high annotation costs, training a classifier only from weakly supervised data has attracted a great deal of attention. Among various approaches, strengthening supervision from completely unsupervised classification is a promising direction, which typically employs class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded with high flexibility, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from $m$ U sets for $m\ge2$. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which aims to predict from which U set each observed sample is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation. We build our method in a flexible and efficient end-to-end deep learning framework and prove it to be classifier-consistent. Through experiments, we demonstrate the superiority of our proposed method over state-of-the-art methods.
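
To make the link between the binary classifier and SSC concrete, here is a minimal sketch of why the two are connected by a linear-fractional map. It is not the authors' implementation, and the notation is assumed for illustration: each U set $j$ is taken to be a mixture of the two class-conditional densities with known class prior $\pi_j$ (`set_priors`), $\rho_j$ (`rho`) is the fraction of all samples that come from set $j$, $\pi$ (`test_prior`) is the class prior of the target distribution, and $\eta(x)$ (`eta`) is the positive-class posterior. Under these assumptions, Bayes' rule gives the probability that $x$ came from set $j$ as a ratio of two expressions that are each linear in $\eta(x)$.

```python
import numpy as np

def set_posteriors(eta, set_priors, rho, test_prior):
    """Map class posteriors eta(x) to set posteriors P(set = j | x).

    All argument names are hypothetical. Assumes each U set j is a mixture
    of the class-conditional densities with class prior set_priors[j].
    """
    eta = np.asarray(eta, dtype=float)[..., None]      # shape (..., 1)
    set_priors = np.asarray(set_priors, dtype=float)   # shape (m,)
    rho = np.asarray(rho, dtype=float)                 # shape (m,), sums to 1

    # Density of set j relative to the target marginal, written via eta(x):
    #   p_j(x) / p(x) = pi_j / pi * eta + (1 - pi_j) / (1 - pi) * (1 - eta)
    pos_ratio = set_priors / test_prior
    neg_ratio = (1.0 - set_priors) / (1.0 - test_prior)
    unnorm = rho * (pos_ratio * eta + neg_ratio * (1.0 - eta))

    # Numerator and denominator are both linear in eta,
    # so each output is a linear-fractional function of eta.
    return unnorm / unnorm.sum(axis=-1, keepdims=True)

# Three U sets with different class priors, equal sizes, balanced target prior.
probs = set_posteriors(eta=[0.1, 0.9],
                       set_priors=[0.2, 0.5, 0.8],
                       rho=[1 / 3, 1 / 3, 1 / 3],
                       test_prior=0.5)
print(probs)  # one row per sample, one column per U set
```

In a training pipeline along these lines, the map would be applied to the output of a binary model, and the resulting $m$-way probabilities would be fit to the set indices with an ordinary multi-class loss. Note that the map carries no information about $\eta$ when all $\pi_j$ are equal, which is why the U sets need distinct class priors.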

#### Author Information

##### Nan Lu (The University of Tokyo/RIKEN)

Nan Lu is a Ph.D. student at the Department of Complexity Science and Engineering, the University of Tokyo. Her research interests lie in the fields of weakly supervised learning, learning with real-world constraints, and deep learning.

##### Gang Niu (RIKEN)

Gang Niu is currently an indefinite-term research scientist at the RIKEN Center for Advanced Intelligence Project.