Feature selection is the task of choosing a small subset of a given set of features that captures the relevant properties of the data. In the context of supervised classification problems, relevance is determined by the labels given on the training data. A good choice of features is key to building compact and accurate classifiers.
In this paper we introduce a margin-based feature selection criterion and apply it to measure the quality of sets of features. Using margins, we devise novel selection algorithms for multi-class classification problems and provide a theoretical generalization bound. We also study the well-known Relief algorithm and show that it resembles gradient ascent over our margin criterion. We apply our new Simba algorithm, which directly optimizes the margin, to various datasets and show that it outperforms Relief.
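To make the Relief update mentioned above concrete, the following is a minimal sketch of one common formulation of Relief, not the paper's own listing and not Simba. It assumes squared Euclidean distances, a single nearest hit and nearest miss per sampled point, and an unnormalized weight vector. The function names, the toy data, and the parameters `n_iter` and `seed` are hypothetical and chosen only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def relief_weights(X, y, n_iter=100, seed=0):
    """Illustrative Relief-style feature weighting (assumed formulation).

    For each randomly sampled point x, find its nearest same-label point
    (near hit) and nearest different-label point (near miss), then reward
    features that separate x from the miss more than from the hit.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iter):
        i = rng.randrange(len(X))
        x, label = X[i], y[i]
        hits = [X[j] for j in range(len(X)) if j != i and y[j] == label]
        misses = [X[j] for j in range(len(X)) if y[j] != label]
        if not hits or not misses:
            continue  # no valid neighbour of one kind; skip this sample
        near_hit = min(hits, key=lambda p: sq_dist(x, p))
        near_miss = min(misses, key=lambda p: sq_dist(x, p))
        for k in range(n_features):
            w[k] += (x[k] - near_miss[k]) ** 2 - (x[k] - near_hit[k]) ** 2
    return w


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [1.0, 0.8], [0.9, 0.2], [1.1, 0.5]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
```

On this toy set the informative feature should receive the larger weight. Features would then typically be ranked by weight and selected by thresholding or by keeping the top few.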