Spotlight
Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more
Elad Tolochinksy · Ibrahim Jubran · Dan Feldman
Ballroom 3 & 4
Abstract:
A coreset (or core-set) is a small weighted \emph{subset} of an input set that, with respect to a given \emph{monotonic} loss function, \emph{provably} approximates the fitting loss of the full input for \emph{any} given query. Using the coreset, we can obtain an approximation of the query that minimizes this loss by running \emph{existing} optimization algorithms on the coreset alone. In this work we provide: (i) a lower bound which proves that there are input sets with no coreset smaller than the input itself for general monotonic loss functions; (ii) a proof that, with an additional common regularization term and under a natural assumption that holds e.g. for logistic regression and the sigmoid activation function, a small coreset exists for \emph{any} input; (iii) a generic coreset construction algorithm that computes such a small coreset in near-linear time; and (iv) experimental results with open-source code which demonstrate that our coresets are effective and are much smaller in practice than predicted in theory.
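To make the workflow concrete, here is a minimal sketch of the general "optimize on a weighted coreset" idea for the regularized logistic loss. This is an illustrative heuristic only, not the paper's construction: the sampling distribution (row norms) and all names and parameters below are assumptions for the example, whereas the paper derives provable sampling bounds.

```python
import numpy as np

def logistic_loss(X, y, w, lam, weights=None):
    # weighted regularized loss: sum_i u_i * log(1 + exp(-y_i <x_i, w>)) + lam * ||w||^2
    u = np.ones(len(y)) if weights is None else weights
    return float(u @ np.logaddexp(0.0, -y * (X @ w)) + lam * (w @ w))

def sample_coreset(X, m, rng):
    # importance sampling proportional to row norms (an illustrative
    # heuristic, NOT the paper's sensitivity-based bounds); the weights
    # 1/(m * p_i) make the weighted loss an unbiased estimate of the full loss
    p = np.linalg.norm(X, axis=1) + 1e-12
    p /= p.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=p)
    return idx, 1.0 / (m * p[idx])

rng = np.random.default_rng(0)
n, d, lam, m = 5000, 5, 1.0, 200            # hypothetical sizes for the demo
X = rng.normal(size=(n, d))
y = np.where(X @ rng.normal(size=d) > 0, 1.0, -1.0)

idx, u = sample_coreset(X, m, rng)

# run an existing optimizer (plain gradient descent) on the small
# weighted coreset only, never touching the full input set again
w = np.zeros(d)
for _ in range(500):
    s = 1.0 / (1.0 + np.exp(np.clip(y[idx] * (X[idx] @ w), -50, 50)))
    grad = -(u * s * y[idx]) @ X[idx] + 2.0 * lam * w
    w -= 0.5 * grad / n
```

The query `w` fitted on the 200-point coreset should already reduce the full-data loss well below its value at the origin, which is the point of the approach: the expensive optimization loop scales with the coreset size, not with n.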