

Contributed Talk
in
Workshop: Negative Dependence: Theory and Applications in Machine Learning

Towards Efficient Evaluation of Risk via Herding

Zelai Xu

2019 Contributed Talk

Abstract: We introduce a novel use of herding to select samples from a large unlabeled dataset in order to efficiently evaluate the risk of a given model. Herding is a deterministic algorithm that draws samples so that their empirical distribution closely approximates the underlying distribution. We use herding to select the most informative samples and show that the loss evaluated on $k$ samples produced by herding converges to the expected loss at a rate of $\mathcal{O}(1/k)$, much faster than the $\mathcal{O}(1/\sqrt{k})$ rate of i.i.d. random sampling. We validate our analysis on both synthetic and real data, and further explore the empirical performance of herding-based sampling in several high-dimensional settings.
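The abstract does not include an implementation, but the sampling step it describes can be sketched as greedy kernel herding over a finite candidate pool: repeatedly pick the point that best closes the gap between the kernel mean of the selected subset and that of the full pool, then average the loss on the selected points. The RBF kernel, the `gamma` value, and the function names below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    # RBF kernel between 1-D points (broadcasts to a Gram matrix);
    # the kernel choice here is an assumption for illustration
    return np.exp(-gamma * (x - y) ** 2)

def herding_select(X, k, gamma=0.5):
    """Greedy kernel herding over a finite pool X of 1-D points.

    At step t, pick the candidate maximizing (t+1)*mu(x) - sum of its
    kernel similarities to already-selected points, which greedily
    matches the subset's kernel mean to the pool's kernel mean.
    """
    n = len(X)
    K = rbf(X[:, None], X[None, :], gamma)   # n x n Gram matrix
    mu = K.mean(axis=1)                      # kernel mean embedding scores
    selected, acc = [], np.zeros(n)
    for t in range(k):
        scores = (t + 1) * mu - acc          # herding objective at step t
        i = int(np.argmax(scores))
        selected.append(i)
        acc += K[i]                          # accumulate similarities to x_i
    return selected
```

A risk estimate then averages the loss only on the `k` herded samples, e.g. `np.mean(losses[herding_select(X, k)])`, instead of labeling the whole pool.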
