Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
Henry Reeve · Ata Kaban

Tue Jun 11th 12:05 -- 12:10 PM @ Room 103

We consider classification in the presence of class-dependent asymmetric label noise with unknown noise probabilities. In this setting, identifiability conditions are known, but additional assumptions were shown to be required for finite sample rates, and only the parametric rate has been obtained so far. Assuming these identifiability conditions, together with a measure-smoothness condition on the regression function and Tsybakov’s margin condition, we obtain, up to a log factor, the minimax optimal rates of the noise-free setting. This rate is attained by a recently proposed modification of the kNN classifier whose analysis exists only under known noise probabilities. Hence, our results provide solid theoretical backing for this empirically successful algorithm. By contrast, the standard kNN classifier is not even consistent in the setting of asymmetric label noise. A key idea in our analysis is a simple kNN-based function optimisation approach that requires far fewer assumptions than existing mode estimators do, and which may be of independent interest for noise proportion estimation and other randomised optimisation problems.
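To make the setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional features, a hypothetical clean regression function `eta` that equals 0 and 1 on regions of positive mass (one simple way to satisfy an identifiability condition), and flip probabilities `P0` and `P1` chosen purely for illustration. Under class-dependent noise the noisy regression function is p0 + (1 - p0 - p1) * eta(x), so its minimum and maximum are p0 and 1 - p1, and the clean decision boundary eta(x) = 1/2 corresponds to the midpoint of those two extremes. The sketch estimates both extremes by optimising a kNN regression estimate over the sample points and thresholds at their midpoint, while the standard kNN rule thresholds at 1/2.

```python
import bisect
import random


def eta(x):
    """Hypothetical clean regression function P(Y=1 | X=x); hits 0 and 1."""
    return min(1.0, max(0.0, (x - 0.3) / 0.4))


def sample(n, p0, p1, rng):
    """Draw sorted features on [0, 1] with class-dependent noisy labels."""
    xs = sorted(rng.random() for _ in range(n))
    ys = []
    for x in xs:
        y = 1 if rng.random() < eta(x) else 0
        flip = p0 if y == 0 else p1  # asymmetric flip probability
        if rng.random() < flip:
            y = 1 - y
        ys.append(y)
    return xs, ys


def knn_mean(q, xs, ys, k):
    """Mean noisy label of the k nearest neighbours of q (xs must be sorted)."""
    lo = hi = bisect.bisect_left(xs, q)  # window xs[lo:hi] starts empty
    while hi - lo < k:
        # Grow the window towards whichever side holds the closer point.
        if lo == 0:
            hi += 1
        elif hi == len(xs):
            lo -= 1
        elif q - xs[lo - 1] <= xs[hi] - q:
            lo -= 1
        else:
            hi += 1
    return sum(ys[lo:hi]) / k


rng = random.Random(0)
P0, P1 = 0.4, 0.1  # illustrative noise rates; the classifier never sees them
K = 200            # illustrative neighbourhood size, not a tuned choice
xs, ys = sample(4000, P0, P1, rng)

# kNN-based optimisation: the extremes of the kNN estimate over the sample
# serve as estimates of p0 (minimum) and 1 - p1 (maximum).
est = [knn_mean(x, xs, ys, K) for x in xs]
threshold = (min(est) + max(est)) / 2

# Evaluate both rules against the clean Bayes labels on a grid.
grid = [(i + 0.5) / 1000 for i in range(1000)]
bayes = [eta(g) >= 0.5 for g in grid]
scores = [knn_mean(g, xs, ys, K) for g in grid]


def accuracy(t):
    """Fraction of grid points where thresholding at t matches Bayes."""
    return sum((s >= t) == b for s, b in zip(scores, bayes)) / len(grid)


acc_standard = accuracy(0.5)         # standard kNN rule
acc_corrected = accuracy(threshold)  # midpoint rule with estimated extremes
print(f"threshold={threshold:.3f}  standard={acc_standard:.3f}  "
      f"corrected={acc_corrected:.3f}")
```

With these illustrative noise rates the ideal threshold on the noisy regression function is (P0 + 1 - P1) / 2 = 0.65 rather than 0.5, which is why the standard rule places the boundary in the wrong place however much data it sees.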

Author Information

Henry Reeve (University of Birmingham)

I am a postdoctoral research fellow in Machine Learning at the University of Birmingham. I am working on the FORGING EPSRC research project led by Professor Ata Kabán. The goal of the project is to explore geometric structures that enable efficient learning from small data samples in high-dimensional learning scenarios. We are particularly focused on the role played by random projections of the data onto a low-dimensional subspace. Within the scope of this project I am focusing on a variety of problems including learning with label noise, mixture proportion estimation, learning in unbounded domains and matrix factorisation. I completed a PhD in the School of Computer Science at the University of Manchester under the supervision of Professor Gavin Brown. My research focus was on learning scenarios with asymmetric costs in high dimensions, with connections to Neyman-Pearson classification and multi-armed bandits. Areas of interest: Minimax rates, Label noise, Mixture proportion estimation, Weakly supervised learning, Random projections, Compressive learning, Matrix factorisation, Dimensionality reduction, Multi-armed bandits.

Ata Kaban (University of Birmingham)

Related Events (a corresponding poster, oral, or spotlight)