
Feature Selection using Stochastic Gates
Yutaro Yamada · Ofir Lindenbaum · Sahand Negahban · Yuval Kluger

Wed Jul 15 11:00 AM -- 11:45 AM & Wed Jul 15 11:00 PM -- 11:45 PM (PDT)
Feature selection problems have been extensively studied in the setting of linear estimation (e.g., LASSO), but less emphasis has been placed on feature selection for non-linear functions. In this study, we propose a method for feature selection in neural network estimation problems. The new procedure is based on a probabilistic relaxation of the $\ell_0$ norm of features, that is, the number of selected features. Our $\ell_0$-based regularization relies on a continuous relaxation of the Bernoulli distribution; this relaxation allows our model to learn the parameters of the approximate Bernoulli distributions via gradient descent. The proposed framework simultaneously learns a non-linear regression or classification function and selects a small subset of features. We provide an information-theoretic justification for incorporating the Bernoulli distribution into feature selection. Furthermore, we evaluate our method on synthetic and real-life data to demonstrate that our approach outperforms other commonly used methods in both predictive performance and feature selection.
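
The abstract does not spell out the form of the relaxation, so the following is only a minimal sketch under stated assumptions. It assumes one possible continuous relaxation of a Bernoulli gate: a learnable mean `mu` per feature, Gaussian noise with a fixed `sigma`, and clipping to [0, 1]. Under that assumption, the expected number of open gates (a differentiable stand-in for the $\ell_0$ count) is the sum of Gaussian CDF values $\Phi(\mu_d / \sigma)$. The names `sample_gates`, `expected_open_gates`, `mu`, and `sigma` are hypothetical and chosen for illustration.

```python
import math
import random


def sample_gates(mu, sigma=0.5, rng=random):
    """Draw one relaxed gate per feature: clip(mu_d + eps_d, 0, 1).

    Assumption: eps_d ~ N(0, sigma^2). This is one possible relaxation,
    not necessarily the exact form used by the authors.
    """
    return [min(1.0, max(0.0, m + rng.gauss(0.0, sigma))) for m in mu]


def expected_open_gates(mu, sigma=0.5):
    """Surrogate for the l0 count: sum_d P(mu_d + eps_d > 0) = sum_d Phi(mu_d / sigma)."""
    return sum(0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0)))) for m in mu)


# Three features: one gate strongly open, one undecided, one strongly closed.
mu = [2.0, 0.0, -2.0]
x = [1.0, 1.0, 1.0]

gates = sample_gates(mu, rng=random.Random(0))
gated_x = [xi * zi for xi, zi in zip(x, gates)]  # input passed on to the network
penalty = expected_open_gates(mu)                # added to the loss, scaled by a weight
```

In a training loop, `gated_x` would feed the network and `penalty`, multiplied by a regularization weight, would be added to the prediction loss, so that gradient descent on `mu` trades off accuracy against the number of features kept.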

Author Information

Yutaro Yamada (Yale University)
Ofir Lindenbaum (Yale)
Sahand Negahban (Yale)
Yuval Kluger (Yale School of Medicine)
