We consider classification in the presence of class-dependent asymmetric label noise with unknown noise probabilities. In this setting, identifiability conditions are known, but additional assumptions were shown to be required for finite-sample rates, and so far only the parametric rate has been obtained. Assuming these identifiability conditions, together with a measure-smoothness condition on the regression function and Tsybakov’s margin condition, we show that the Robust kNN classifier of Gao et al. attains the minimax optimal rates of the noise-free setting, up to a log factor, even when trained on data with unknown asymmetric label noise. Hence, our results provide a solid theoretical backing for this empirically successful algorithm. By contrast, the standard kNN classifier is not even consistent in the setting of asymmetric label noise. A key idea in our analysis is a simple kNN-based method for estimating the maximum of a function that requires far fewer assumptions than existing mode estimators do, and which may be of independent interest for noise proportion estimation and randomised optimisation problems.
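The idea of estimating a function's maximum with kNN, and using it to correct for label noise, can be illustrated with a minimal sketch. It is not the authors' or Gao et al.'s implementation. It assumes binary labels in {0, 1} and that the minimum and maximum of the noisy regression function sit at the two unknown noise levels, so that their midpoint is a sensible decision threshold. All function names (`knn_regression`, `knn_max_estimate`, `robust_knn_predict`) are hypothetical.

```python
import numpy as np


def knn_regression(X_train, y_train, X_query, k):
    """Average the labels of the k nearest training points of each query point."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances per row (order within the k is arbitrary).
    nn_idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return y_train[nn_idx].mean(axis=1)


def knn_max_estimate(X, y, k):
    """Estimate the maximum of the regression function E[y | x] as the
    largest kNN average taken over the sample points themselves."""
    return knn_regression(X, y, X, k).max()


def robust_knn_predict(X_train, y_noisy, X_query, k):
    """Threshold the kNN regression estimate at the midpoint of its estimated
    minimum and maximum instead of at the fixed value 1/2 (assumed rule)."""
    eta_train = knn_regression(X_train, y_noisy, X_train, k)
    threshold = 0.5 * (eta_train.min() + eta_train.max())
    eta_query = knn_regression(X_train, y_noisy, X_query, k)
    return (eta_query >= threshold).astype(int)
```

With asymmetric noise the noisy regression function can stay at or below 1/2 on the whole positive region. In that case a fixed 1/2 threshold is unreliable, while the midpoint threshold adapts to the estimated noise levels.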
Henry Reeve (University of Birmingham)
I am a postdoctoral research fellow in Machine Learning at the University of Birmingham. I am working on the FORGING EPSRC research project led by Professor Ata Kabán. The goal of the project is to explore geometric structures which enable efficient learning from small data samples in high-dimensional learning scenarios. We are particularly focused on the role played by random projections of the data onto a low-dimensional subspace. Within the scope of this project I am focusing on a variety of problems including learning with label noise, mixture proportion estimation, learning in unbounded domains and matrix factorisation. I did a PhD in the School of Computer Science at the University of Manchester under the supervision of Professor Gavin Brown. My research focus was on learning scenarios with asymmetric costs in high dimensions, with connections to Neyman-Pearson classification and multi-armed bandits. Areas of interest: Minimax rates, Label noise, Mixture proportion estimation, Weakly supervised learning, Random projections, Compressive learning, Matrix factorisation, Dimensionality reduction, Multi-armed bandits.
Ata Kaban (University of Birmingham)
Related Events (a corresponding poster, oral, or spotlight)
2019 Oral: Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
Tue Jun 11th 12:05 -- 12:10 PM, Room 103