We address the problem of finding influential training samples for a particular class of models: tree ensembles such as Random Forest (RF) or Gradient Boosted Decision Trees (GBDT). A natural way to formalize this problem is to study how the model's predictions change under leave-one-out retraining, in which each training sample is left out in turn. Recent work has shown that, for parametric models, this analysis can be conducted in a computationally efficient way. We propose several ways of extending this framework to non-parametric GBDT ensembles under the assumption that tree structures remain fixed. Furthermore, we introduce a general scheme for obtaining further approximations to our method that trade off performance against computational complexity. We evaluate our approaches on various experimental setups and use-case scenarios and demonstrate both the quality of our approach relative to the baselines and its computational efficiency.
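To make the leave-one-out formulation concrete, the following is a minimal toy sketch and not the authors' method. It assumes a single regression stump with a fixed split threshold, standing in for the fixed tree structure assumption, and leaf values equal to the mean target in each leaf. All function names and data here are hypothetical and chosen only for illustration.

```python
from statistics import mean


def leaf_index(x, threshold):
    """Route a scalar feature to leaf 0 (left) or leaf 1 (right) of a fixed stump."""
    return 0 if x <= threshold else 1


def fit_leaf_values(xs, ys, threshold, skip=None):
    """Compute leaf values as the mean target of the samples in each leaf.

    `skip` is the index of a training sample to leave out (or None).
    The threshold is never re-learned, which mirrors the assumption
    that the tree structure stays fixed during retraining.
    """
    leaves = {0: [], 1: []}
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i != skip:
            leaves[leaf_index(x, threshold)].append(y)
    # Toy convention (an assumption): an emptied leaf predicts 0.0.
    return {k: (mean(v) if v else 0.0) for k, v in leaves.items()}


def loo_influence(xs, ys, threshold, x_test):
    """Return the change in the prediction at x_test for each left-out sample."""
    test_leaf = leaf_index(x_test, threshold)
    base = fit_leaf_values(xs, ys, threshold)[test_leaf]
    influences = []
    for i in range(len(xs)):
        reduced = fit_leaf_values(xs, ys, threshold, skip=i)
        influences.append(reduced[test_leaf] - base)
    return influences


# Hypothetical data: three samples fall in the left leaf, two in the right.
xs = [1.0, 2.0, 3.0, 8.0, 9.0]
ys = [1.0, 2.0, 6.0, 10.0, 12.0]
influences = loo_influence(xs, ys, threshold=5.0, x_test=2.5)
```

In this toy setting, only training samples that share a leaf with the test point have nonzero influence. A real GBDT has many trees whose leaf values depend on earlier trees, so brute-force retraining per sample becomes expensive, which is the cost the abstract's approximations aim to avoid.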
Author Information
Boris Sharchilev (Yandex)
Yury Ustinovskiy (Princeton University)
Pavel Serdyukov (Yandex)
Maarten de Rijke (University of Amsterdam)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Finding Influential Training Samples for Gradient Boosted Decision Trees
  Thu. Jul 12th 12:10 -- 12:20 PM Room A6
More from the Same Authors
- 2021: How Not to Measure Disentanglement
  · Julia Kiseleva · Maarten de Rijke
- 2021: Counterfactual Explanations for Graph Neural Networks
  Ana Lucic · Maartje ter Hoeve · Gabriele Tolomei · Maarten de Rijke · Fabrizio Silvestri
- 2021: Flexible Interpretability through Optimizable Counterfactual Explanations for Tree Ensembles
  Ana Lucic · Harrie Oosterhuis · Hinda Haned · Maarten de Rijke
- 2021: CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks
  Ana Lucic · Maartje ter Hoeve · Gabriele Tolomei · Maarten de Rijke · Fabrizio Silvestri
- 2019 Poster: Learning to select for a predefined ranking
  Aleksei Ustimenko · Aleksandr Vorobev · Gleb Gusev · Pavel Serdyukov
- 2019 Oral: Learning to select for a predefined ranking
  Aleksei Ustimenko · Aleksandr Vorobev · Gleb Gusev · Pavel Serdyukov