Aggregate Models, Not Explanations: Improving Feature Importance Estimation
Abstract
Feature-importance methods show promise for transforming machine learning (ML) models from predictive engines into tools for scientific discovery. However, expressive models can be unstable with respect to data sampling and algorithmic stochasticity, which yields inaccurate variable-importance estimates and undermines their utility in critical biomedical applications. While ensembling offers a remedy, the choice between explaining a single ensemble model and aggregating the explanations of individual models is non-trivial due to the non-linearity of importance measures, and it remains largely understudied. Our theoretical analysis, developed under assumptions accommodating complex state-of-the-art ML models, reveals that this choice is governed by a trade-off involving the model's excess risk. In contrast to prior literature, we show that ensembling at the model level provides more accurate variable-importance estimates, particularly for expressive models, by reducing this leading error term. We validate these findings on classical benchmarks and a large-scale proteomic study from the UK Biobank.
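To make the two strategies compared above concrete, the following is a minimal illustrative sketch, not the paper's method: it assumes permutation importance as the importance measure, gradient-boosted trees as the base learners, and synthetic data; all names (n_models, agg_explanations, agg_models) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, VotingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real study.
X, y = make_regression(n_samples=500, n_features=10, noise=0.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit several stochastic base models (subsampling makes the seed matter).
n_models = 10
models = [
    GradientBoostingRegressor(subsample=0.8, random_state=s).fit(X_tr, y_tr)
    for s in range(n_models)
]

# Strategy 1: aggregate explanations -- average each model's importance vector.
per_model = [
    permutation_importance(m, X_te, y_te, n_repeats=5, random_state=0).importances_mean
    for m in models
]
agg_explanations = np.mean(per_model, axis=0)

# Strategy 2: aggregate models -- explain the single ensemble predictor.
ensemble = VotingRegressor([(f"m{s}", m) for s, m in enumerate(models)])
ensemble.fit(X_tr, y_tr)  # VotingRegressor refits clones of the base models
agg_models = permutation_importance(
    ensemble, X_te, y_te, n_repeats=5, random_state=0
).importances_mean
```

Under this sketch, `agg_explanations` corresponds to aggregating individual model explanations, while `agg_models` corresponds to explaining the ensemble itself; the abstract's claim is that the latter is more accurate for expressive models.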