ICML Bayesian decision analysis for collecting nearly-optimal subsets

Poster in Workshop: Subset Selection in Machine Learning: From Theory to Applications

Bayesian decision analysis for collecting nearly-optimal subsets

Daniel Kowal

Abstract:

Subset selection is valuable for interpretable learning, scientific discovery, and data compression. However, classical subset selection is often eschewed due to instability and computational bottlenecks. We address these challenges using Bayesian decision analysis. For any Bayesian predictive model, we elicit an acceptable family of subsets that provide nearly-optimal linear prediction. The Bayesian model regularizes the linear coefficients and propagates uncertainty quantification for key predictive comparisons. The acceptable family spawns new variable importance metrics based on whether a variable appears in all, some, or no acceptable subsets. The proposed approach exhibits excellent prediction and stability for both simulated data (including p = 400 > n) and a large education dataset with highly correlated covariates.
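
As a rough illustration of the "all, some, or no acceptable subsets" idea, the sketch below labels each variable by how often it appears across a given family of subsets. It assumes the acceptable family has already been computed and is supplied as sets of column indices; the function name `classify_variables`, the labels, and the toy family are hypothetical and are not taken from the paper, and the sketch does not show how the acceptable family itself is elicited.

```python
from typing import Dict, Iterable, List


def classify_variables(acceptable_subsets: Iterable[Iterable[int]], p: int) -> Dict[int, str]:
    """Label each of p variables as appearing in 'all', 'some', or 'none'
    of the given subsets (hypothetical helper, for illustration only)."""
    subsets: List[set] = [set(s) for s in acceptable_subsets]
    if not subsets:
        raise ValueError("the family of subsets must be non-empty")

    in_all = set.intersection(*subsets)  # variables present in every subset
    in_any = set.union(*subsets)         # variables present in at least one subset

    labels: Dict[int, str] = {}
    for j in range(p):
        if j in in_all:
            labels[j] = "all"
        elif j in in_any:
            labels[j] = "some"
        else:
            labels[j] = "none"
    return labels


# Toy family of three subsets over p = 5 variables (made-up indices).
family = [{0, 1, 2}, {0, 2, 3}, {0, 2}]
labels = classify_variables(family, p=5)
print(labels)
```

In this toy family, variables 0 and 2 appear in every subset, variables 1 and 3 appear in only some, and variable 4 appears in none.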
