Rehashing Kernel Evaluation in High Dimensions
Paris Siminelakis · Kexin Rong · Peter Bailis · Moses Charikar · Philip Levis

Tue Jun 11 02:40 PM -- 03:00 PM (PDT) @ Room 101

Kernel methods are effective but do not scale well to large-scale data: a larger training set improves accuracy but incurs a quadratic increase in overall evaluation time. This is especially true in high dimensions, where the geometric data structures used to accelerate kernel evaluation suffer from the curse of dimensionality. Recent theoretical advances have proposed fast kernel evaluation algorithms that leverage hashing techniques to achieve worst-case asymptotic improvements. However, these advances remain largely confined to the theoretical realm due to concerns such as super-linear preprocessing time and diminishing gains on non-worst-case datasets. In this paper, we close the gap between theory and practice by addressing these challenges with provable and practical procedures for adaptive sample size selection and preprocessing time reduction, along with new refined data-dependent variance bounds that quantify the performance of random sampling and hashing-based kernel evaluation methods on a given dataset. Our experiments show that these new tools offer up to a 10x improvement in evaluation time on a range of synthetic and real-world datasets.
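To make the quadratic-cost baseline and the random-sampling alternative mentioned in the abstract concrete, here is an illustrative sketch (not the paper's algorithm): an exact Gaussian kernel density evaluation, which touches every data point per query, next to an unbiased estimator that averages the kernel over a uniform random sample. The function names, bandwidth choice, and sample size are assumptions for illustration only.

```python
import numpy as np

def exact_kde(queries, data, bandwidth=1.0):
    # Exact kernel density: O(m * n) pairwise kernel evaluations
    # for m queries and n data points.
    d2 = ((queries[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)

def sampled_kde(queries, data, num_samples, rng, bandwidth=1.0):
    # Unbiased Monte Carlo estimate: average the kernel over a
    # uniform random subsample, O(m * num_samples) total work.
    idx = rng.integers(0, len(data), size=num_samples)
    sub = data[idx]
    d2 = ((queries[:, None, :] - sub[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)

rng = np.random.default_rng(0)
data = rng.normal(size=(5000, 3))
queries = rng.normal(size=(4, 3))
exact = exact_kde(queries, data)
approx = sampled_kde(queries, data, num_samples=1000, rng=rng)
```

The variance of such a uniform estimator depends heavily on the dataset, which is exactly what the paper's data-dependent bounds quantify; hashing-based estimators bias the sampling toward points with large kernel contributions to reduce that variance.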

Author Information

Paris Siminelakis (Stanford University)
Kexin Rong (Stanford University)
Peter Bailis (Stanford University)
Moses Charikar (Stanford University)
Philip Levis (Stanford University)
