Finding Most Influential Sets
Lucas Konrad ⋅ Nikolas Kuschnig
Abstract
Identifying *most influential sets* (MIS) – size-$k$ subsets whose removal maximally changes a target estimand – is typically infeasible because it requires searching over $\binom{n}{k}$ subsets. We show that, for a broad class of estimands whose leave-set-out effect admits a linear-fractional form, the MIS problem reduces to a one-parameter sequence of top-$k$ selections. Using Dinkelbach's method, we obtain an efficient algorithm that runs in $O(n)$ per iteration and terminates in finitely many steps. Our approach returns globally optimal sets in univariate settings, such as average treatment effect estimation in randomized experiments. For partially linear models, we establish selection consistency under Neyman orthogonality and mild first-stage stability. We validate our method through simulations and real-world applications – recovering MIS that were previously computationally inaccessible.