Efficient Rashomon Set Approximation for Decision Trees
Abstract
Standard machine learning pipelines often admit many near-optimal models. These “Rashomon sets” pose both challenges and opportunities for trustworthy decision making: they expose model multiplicity and enable users to search for accurate models that also satisfy secondary desiderata such as fairness, robustness, feature-use constraints, or domain-specific requirements. Rather than committing to a single empirical-risk minimizer, the Rashomon-set paradigm lets practitioners choose among a set of good models using criteria that may be difficult to encode directly in an objective. However, computation of Rashomon sets, even for simple, interpretable model classes such as sparse decision trees, continues to require immense memory and runtime resources. We present PRAXIS, an algorithm to approximate this Rashomon set with orders-of-magnitude improvement in runtime and memory usage. We validate that PRAXIS regularly recovers almost all of the full Rashomon set. PRAXIS allows researchers and practitioners to scalably model the Rashomon set for real-world datasets.