Sequential Attention for Feature Selection
Taisuke Yasuda · Mohammad Hossein Bateni · Lin Chen · Matthew Fahrbach · Thomas Fu · Vahab Mirrokni
Event URL: https://openreview.net/forum?id=Pn7CPyHmrr
Feature selection is the problem of selecting a subset of features for a machine learning model that maximizes model quality subject to a budget constraint. For neural networks, prior methods, including those based on $\ell_1$ regularization, attention, and other techniques, typically select the entire feature subset in one evaluation round, ignoring the residual value of features during selection, i.e., the marginal contribution of a feature given that other features have already been selected. We propose a feature selection algorithm called Sequential Attention that achieves state-of-the-art empirical results for neural networks. This algorithm is based on an efficient one-pass implementation of greedy forward selection and uses attention weights at each step as a proxy for feature importance. We give theoretical insights into our algorithm for linear regression by showing that an adaptation to this setting is equivalent to the classical Orthogonal Matching Pursuit (OMP) algorithm, and thus inherits all of its provable guarantees. Our theoretical and empirical analyses offer new explanations for the effectiveness of attention and its connections to overparameterization, which may be of independent interest.
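The abstract describes greedy forward selection with attention weights as a proxy for feature importance: in each round, a model is trained with learnable attention logits over the not-yet-selected features, and the feature with the largest attention weight is added to the selected set. A minimal sketch of that loop for a linear model trained by gradient descent is below; the function name, hyperparameters, and softmax parameterization are illustrative assumptions, not the authors' implementation (which uses a single training pass over neural networks).

```python
import numpy as np

def sequential_attention(X, y, k, epochs=300, lr=0.1):
    """Illustrative greedy attention-based forward selection (sketch).

    Each round trains a linear model whose unselected inputs are scaled
    by softmax attention weights; the feature with the largest attention
    logit is then added to the selected set.
    """
    n, d = X.shape
    selected = []
    for _ in range(k):
        rest = [j for j in range(d) if j not in selected]
        logits = np.zeros(len(rest))  # attention logits for candidates
        w = np.zeros(d)               # linear model weights
        for _ in range(epochs):
            a = np.exp(logits - logits.max())
            a /= a.sum()              # softmax over candidate features
            scale = np.ones(d)
            scale[rest] = a           # selected features pass through unscaled
            r = X @ (scale * w) - y   # residual of the attended model
            grad_w = scale * (X.T @ r) / n
            # chain rule through the softmax for the logit gradient
            g = w[rest] * (X[:, rest].T @ r) / n
            grad_logits = a * (g - (a * g).sum())
            w -= lr * grad_w
            logits -= lr * grad_logits
        selected.append(rest[int(np.argmax(logits))])
    return selected
```

On well-conditioned data the attention concentrates on the feature with the largest marginal contribution given the features already chosen, which is the same greedy criterion OMP uses in the linear-regression setting the abstract analyzes.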

Author Information

Taisuke Yasuda (School of Computer Science, Carnegie Mellon University)
Mohammad Hossein Bateni (Google Research)
Lin Chen (Yale University)
Matthew Fahrbach (Google Research)
Thomas Fu (Google Research)
Vahab Mirrokni (Google Research)
