Feature crossing is a popular method for augmenting the feature set of a machine learning model by taking the Cartesian product of a small number of existing categorical features. While feature crosses have traditionally been hand-picked by domain experts, a recent line of work has focused on the automatic discovery of informative feature crosses. Our work proposes a simple, efficient, and effective approach to this problem that uses tensor proxies and a novel application of the attention mechanism to convert the combinatorial problem of feature cross search into a continuous optimization problem. By solving the continuous optimization problem and then rounding the solution to a feature cross, we obtain a highly efficient algorithm for feature cross search that trains only a single model, unlike prior greedy methods that require training a large number of models. Through extensive empirical evaluations, we show that our algorithm is not only efficient but also discovers more informative feature crosses, allowing us to achieve state-of-the-art results for feature cross models. Furthermore, even without the rounding step, we obtain a novel DNN architecture that augments existing models with a small number of features to improve quality without introducing any feature crosses, thereby avoiding the cost of storing additional large embedding tables for these crosses.
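To make the two ideas in the abstract concrete, the sketch below (not from the paper) illustrates a feature cross as the Cartesian product of two toy categorical features, and a softmax-attention relaxation that replaces the discrete choice of which feature to include with learnable continuous weights, followed by an argmax rounding step. The data, dimensions, and variable names are illustrative assumptions in plain NumPy, not the authors' actual architecture.

```python
import numpy as np

# Toy categorical features (hypothetical example data).
country = np.array(["US", "US", "JP", "JP"])
device = np.array(["web", "app", "web", "app"])

# A feature cross takes values in the Cartesian product of the two vocabularies;
# per example it is the pair of values, which in practice would be hashed into a
# new vocabulary with its own embedding table.
cross = np.char.add(np.char.add(country, "_x_"), device)
# ['US_x_web', 'US_x_app', 'JP_x_web', 'JP_x_app']

# Continuous relaxation of the discrete search: instead of enumerating feature
# combinations, learn softmax attention weights over candidate features and use
# a weighted combination of their embeddings during training.
rng = np.random.default_rng(0)
num_features, dim = 4, 8
logits = rng.standard_normal(num_features)        # trainable parameters
weights = np.exp(logits) / np.exp(logits).sum()   # softmax attention weights
embeddings = rng.standard_normal((num_features, dim))
soft_feature = weights @ embeddings               # differentiable surrogate

# Rounding step: snap the continuous solution back to a discrete feature choice.
hard_choice = int(np.argmax(weights))
```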
Author Information
Taisuke Yasuda (School of Computer Science, Carnegie Mellon University)
Mohammad Hossein Bateni (Google Research)
Lin Chen (Yale University)
Matthew Fahrbach (Google Research)
Thomas Fu (Google Research)
More from the Same Authors
- 2023 : Preference Elicitation for Music Recommendations »
  Ofer Meshi · Jon Feldman · Li Yang · Ben Scheetz · Yanli Cai · Mohammad Hossein Bateni · Corbyn Salisbury · Vikram Aggarwal · Craig Boutilier
- 2023 : Tackling Provably Hard Representative Selection via Graph Neural Networks »
  Mehran Kazemi · Anton Tsitsulin · Hossein Esfandiari · Mohammad Hossein Bateni · Deepak Ramachandran · Bryan Perozzi · Vahab Mirrokni
- 2023 : Sequential Attention for Feature Selection »
  Taisuke Yasuda · Mohammad Hossein Bateni · Lin Chen · Matthew Fahrbach · Thomas Fu · Vahab Mirrokni
- 2023 Poster: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
  David Woodruff · Taisuke Yasuda
- 2023 Poster: Learning Rate Schedules in the Presence of Distribution Shift »
  Matthew Fahrbach · Adel Javanmard · Vahab Mirrokni · Pratik Worah
- 2023 Oral: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
  David Woodruff · Taisuke Yasuda
- 2023 Poster: Approximately Optimal Core Shapes for Tensor Decompositions »
  Mehrdad Ghadiri · Matthew Fahrbach · Thomas Fu · Vahab Mirrokni
- 2020 Poster: More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models »
  Lin Chen · Yifei Min · Mingrui Zhang · Amin Karbasi
- 2019 Poster: Categorical Feature Compression via Submodular Optimization »
  Mohammad Hossein Bateni · Lin Chen · Hossein Esfandiari · Thomas Fu · Vahab Mirrokni · Afshin Rostamizadeh
- 2019 Oral: Categorical Feature Compression via Submodular Optimization »
  Mohammad Hossein Bateni · Lin Chen · Hossein Esfandiari · Thomas Fu · Vahab Mirrokni · Afshin Rostamizadeh
- 2019 Poster: Distributed Weighted Matching via Randomized Composable Coresets »
  Sepehr Assadi · Mohammad Hossein Bateni · Vahab Mirrokni
- 2019 Oral: Distributed Weighted Matching via Randomized Composable Coresets »
  Sepehr Assadi · Mohammad Hossein Bateni · Vahab Mirrokni