Oral
Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
Henry Reeve · Ata Kaban
We consider classification in the presence of class-dependent asymmetric label noise with unknown noise probabilities. In this setting, identifiability conditions are known, but additional assumptions were shown to be required for finite sample rates, and only the parametric rate has been obtained so far. Assuming these identifiability conditions, together with a measure-smoothness condition on the regression function and Tsybakov’s margin condition, we obtain, up to a log factor, the mini-max optimal rates of the noise-free setting. This rate is attained by a recently proposed modification of the kNN classifier whose analysis exists only under known noise probabilities. Hence, our results provide solid theoretical backing for this empirically successful algorithm. By contrast the standard kNN is not even consistent in the setting of asymmetric label noise. A key idea in our analysis is a simple kNN based function optimisation approach that requires far less assumptions than existing mode estimators do, and which may be of independent interest for noise proportion estimation and other randomised optimisation problems.