We study an online, streaming unsupervised classification system. Our setting consists of a collection of classifiers (with unknown confusion matrices), each of which can classify one sample per unit time, and a stream of unlabeled samples that accesses them. Each sample is dispatched to one or more classifiers and, depending on the labels collected from these classifiers, may be sent to other classifiers to collect additional labels. The labels are continually aggregated. Once the aggregated label reaches a pre-specified accuracy threshold, or the sample has been sent to all the classifiers, the now-labeled sample is ejected from the system. For any given accuracy threshold, the objective is to sustain the maximum possible arrival rate of new samples such that the number of samples in memory does not grow without bound. In this paper, we characterize the Pareto-optimal region of accuracy and arrival rate, and develop an algorithm that can operate at any point within this region. Our algorithm combines queueing-based routing and scheduling approaches with a novel online tensor decomposition method that learns the hidden parameters, to obtain Pareto-optimality guarantees. Finally, we verify our theoretical results through simulations on various synthetic and real ensembles, where the real ensembles are formed using deep image classifiers, e.g., AlexNet, VGG, and ResNet.
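The label-collection loop described above can be illustrated for a single sample. This is only a minimal sketch under simplifying assumptions, not the paper's algorithm: the confusion matrices are taken as known and strictly positive (the paper learns them online via tensor decomposition), classifiers are queried in a fixed order (the paper's queueing-based routing and scheduling are omitted), and labels are aggregated with a plain Bayesian posterior. The function name `aggregate_until_confident` and all parameter names are hypothetical.

```python
import numpy as np

def aggregate_until_confident(true_label, confusions, prior, threshold, rng):
    """Query classifiers in order until the posterior over labels is confident.

    confusions[k][i, j] is assumed to be P(classifier k outputs j | true label i).
    Returns (aggregated label, posterior confidence, number of classifiers used).
    """
    posterior = np.asarray(prior, dtype=float)
    used = 0
    for C in confusions:
        # Eject the sample as soon as the accuracy threshold is met.
        if posterior.max() >= threshold:
            break
        # Simulate the noisy label this classifier returns for the sample.
        observed = rng.choice(len(posterior), p=C[true_label])
        # Bayes update: multiply by the likelihood of the observed label.
        posterior = posterior * C[:, observed]
        posterior /= posterior.sum()
        used += 1
    # The sample is also ejected once every classifier has been used.
    return int(posterior.argmax()), float(posterior.max()), used

# Hypothetical ensemble: three identical, fairly accurate binary classifiers.
C = np.array([[0.99, 0.01],
              [0.01, 0.99]])
rng = np.random.default_rng(0)
label, confidence, used = aggregate_until_confident(
    true_label=0, confusions=[C, C, C], prior=[0.5, 0.5],
    threshold=0.9, rng=rng)
```

A higher threshold makes each sample occupy more classifier time on average, which lowers the arrival rate the system can sustain; this is the accuracy versus arrival-rate trade-off that the Pareto-optimal region captures.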
Soumya Basu (University of Texas at Austin)
Steven Gutstein (ARL)
Brent Lance (Army Research Laboratory)
Sanjay Shakkottai (University of Texas at Austin)
Related Events (a corresponding poster, oral, or spotlight)
2019 Poster: Pareto Optimal Streaming Unsupervised Classification
Thu Jun 13th 06:30 -- 09:00 PM Room Pacific Ballroom