We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time. The predictive power of such models can depend heavily on the structure of the tree, and although past work showed how to learn the tree structure, it assumed that the feature vectors remain static. We provide a novel algorithm that simultaneously performs representation learning for the input data and learning of the hierarchical predictor. Our approach optimizes an objective function which favors balanced and easily separable multi-way node partitions. We theoretically analyze this objective, showing that it gives rise to a boosting-style property and a bound on classification error. We next show how to extend the algorithm to conditional density estimation. We empirically validate both variants of the algorithm on text classification and language modeling, respectively, and show that they compare favorably to common baselines in terms of accuracy and running time.
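To make the idea of a "balanced and easily separable" node partition concrete, here is a minimal sketch of one way to score a single tree node. It is not the authors' implementation: the function name `partition_score`, its inputs, and the exact formula are assumptions chosen for illustration. The score compares how often each child is reached overall with how often it is reached by each class, so it is high when classes are routed decisively (separable) and children receive similar mass (balanced).

```python
def partition_score(class_priors, child_probs_given_class):
    """Illustrative score for a multi-way node partition (hypothetical helper).

    class_priors[i]: fraction of examples reaching this node that have class i.
    child_probs_given_class[i][j]: probability that a class-i example is
        routed to child j (each row sums to 1).
    Returns a value in [0, 1] for two children; higher is better.
    """
    num_children = len(child_probs_given_class[0])

    # Marginal probability of reaching each child, averaged over classes.
    child_marginals = [
        sum(q * row[j] for q, row in zip(class_priors, child_probs_given_class))
        for j in range(num_children)
    ]

    # Prior-weighted gap between per-class routing and the marginal routing.
    total = 0.0
    for q, row in zip(class_priors, child_probs_given_class):
        for j in range(num_children):
            total += q * abs(child_marginals[j] - row[j])
    return 2.0 * total / num_children


priors = [0.25, 0.25, 0.25, 0.25]
balanced_pure = [[1, 0], [1, 0], [0, 1], [0, 1]]    # two classes per child
unbalanced_pure = [[1, 0], [1, 0], [1, 0], [0, 1]]  # three classes vs. one
uninformative = [[0.5, 0.5]] * 4                    # routing ignores the class

score_balanced = partition_score(priors, balanced_pure)
score_unbalanced = partition_score(priors, unbalanced_pure)
score_uninformative = partition_score(priors, uninformative)
```

Under these assumptions, the balanced and pure split scores highest, the pure but lopsided split scores lower, and routing that ignores the class scores zero.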
Author Information
Yacine Jernite (New York University)
Anna Choromanska (New York University)
David Sontag (Massachusetts Institute of Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Poster: Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
  Mon. Aug 7th 08:30 AM -- 12:00 PM Room Gallery #16
More from the Same Authors
- 2022: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets
  Michael Oberst · Nikolaj Thams · David Sontag
- 2022: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets
  Nikolaj Thams · Michael Oberst · David Sontag
- 2022 Poster: Sample Efficient Learning of Predictors that Complement Humans
  Mohammad-Amin Charusaie · Hussein Mozannar · David Sontag · Samira Samadi
- 2022 Poster: Co-training Improves Prompt-based Learning for Large Language Models
  Hunter Lang · Monica Agrawal · Yoon Kim · David Sontag
- 2022 Spotlight: Sample Efficient Learning of Predictors that Complement Humans
  Mohammad-Amin Charusaie · Hussein Mozannar · David Sontag · Samira Samadi
- 2022 Spotlight: Co-training Improves Prompt-based Learning for Large Language Models
  Hunter Lang · Monica Agrawal · Yoon Kim · David Sontag
- 2021 Poster: Neural Pharmacodynamic State Space Modeling
  Zeshan Hussain · Rahul G. Krishnan · David Sontag
- 2021 Poster: Regularizing towards Causal Invariance: Linear Models with Proxies
  Michael Oberst · Nikolaj Thams · Jonas Peters · David Sontag
- 2021 Poster: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch)
  Hunter Lang · David Sontag · Aravindan Vijayaraghavan
- 2021 Spotlight: Regularizing towards Causal Invariance: Linear Models with Proxies
  Michael Oberst · Nikolaj Thams · Jonas Peters · David Sontag
- 2021 Oral: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch)
  Hunter Lang · David Sontag · Aravindan Vijayaraghavan
- 2021 Spotlight: Neural Pharmacodynamic State Space Modeling
  Zeshan Hussain · Rahul G. Krishnan · David Sontag
- 2020 Poster: Estimation of Bounds on Potential Outcomes For Decision Making
  Maggie Makar · Fredrik Johansson · John Guttag · David Sontag
- 2020 Poster: Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models
  Rares-Darius Buhai · Yoni Halpern · Yoon Kim · Andrej Risteski · David Sontag
- 2020 Poster: Consistent Estimators for Learning to Defer to an Expert
  Hussein Mozannar · David Sontag
- 2019 Poster: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
  Michael Oberst · David Sontag
- 2019 Oral: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
  Michael Oberst · David Sontag
- 2019 Poster: Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
  Anna Choromanska · Benjamin Cowen · Sadhana Kumaravel · Ronny Luss · Mattia Rigotti · Irina Rish · Paolo DiAchille · Viatcheslav Gurev · Brian Kingsbury · Ravi Tejwani · Djallel Bouneffouf
- 2019 Oral: Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
  Anna Choromanska · Benjamin Cowen · Sadhana Kumaravel · Ronny Luss · Mattia Rigotti · Irina Rish · Paolo DiAchille · Viatcheslav Gurev · Brian Kingsbury · Ravi Tejwani · Djallel Bouneffouf
- 2018 Poster: Semi-Amortized Variational Autoencoders
  Yoon Kim · Sam Wiseman · Andrew Miller · David Sontag · Alexander Rush
- 2018 Oral: Semi-Amortized Variational Autoencoders
  Yoon Kim · Sam Wiseman · Andrew Miller · David Sontag · Alexander Rush
- 2017 Poster: Estimating individual treatment effect: generalization bounds and algorithms
  Uri Shalit · Fredrik D Johansson · David Sontag
- 2017 Talk: Estimating individual treatment effect: generalization bounds and algorithms
  Uri Shalit · Fredrik D Johansson · David Sontag