MiniMax Learning of Interpretable Factored Stochastic Policies from Conjoint Data, with Uncertainty Quantification
Connor T. T. Jerzak ⋅ Priyanshi Chandra ⋅ Rishi Hazra
Abstract
We study offline learning of factored stochastic policies over extremely large, combinatorial action spaces and show how standard conjoint data can be used to estimate such policies with asymptotically valid uncertainty under regularity conditions. Conjoint analyses typically report average marginal component effects (AMCEs) by averaging over opponent attributes, and thus ignore strategic interdependence. We instead learn \emph{stochastic interventions}, i.e., product-of-Categorical policies over factor levels, that (i) optimize expected outcomes in an average-case setting and (ii) extend to a two-player \emph{minimax} (adversarial) setting that realistically captures simultaneous strategic candidate selection. Methodologically, we derive a closed-form solution for the average-case optimizer under two-way interactions with $L_2$ variance regularization, and we provide a general gradient-based procedure for richer model classes. Uncertainty from the outcome model propagates asymptotically to both the optimal policy and its value via a Delta-method approximation. We further model institutional details (e.g., primaries) inside the minimax objective and introduce a data-driven measure of strategic divergence between parties. On synthetic data, we empirically characterize finite-sample error and coverage as the dimensionality and the sample size $n$ vary. On a U.S. presidential conjoint experiment, adversarially learned policies produce restricted-equilibrium vote shares that align with historical election ranges in our data, in stark contrast to non-adversarial (averaging) optimizers.
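To make the central object concrete: a minimal sketch, assuming hypothetical attribute names and level counts, of a factored stochastic policy represented as an independent Categorical distribution per attribute. A full candidate profile is drawn by sampling each factor independently, and its probability under the policy is the product of the per-factor level probabilities.

```python
import numpy as np

# Hypothetical illustration of a product-of-Categoricals stochastic policy.
# Attribute names ("age", "party", "policy_position") and level counts are
# invented for this sketch; they are not from the paper.
rng = np.random.default_rng(0)

policy = {
    "age": np.array([0.2, 0.5, 0.3]),              # 3 age brackets
    "party": np.array([0.6, 0.4]),                 # 2 levels
    "policy_position": np.array([0.1, 0.3, 0.6]),  # 3 levels
}

def sample_profile(policy, rng):
    """Draw one full candidate profile, sampling each factor independently."""
    return {f: int(rng.choice(len(p), p=p)) for f, p in policy.items()}

def profile_probability(policy, profile):
    """Probability of a profile: the product of per-factor level probabilities."""
    prob = 1.0
    for f, level in profile.items():
        prob *= policy[f][level]
    return prob

profile = sample_profile(policy, rng)
print(profile, profile_probability(policy, profile))
```

Because the policy factorizes, the action space grows multiplicatively in the number of levels per attribute while the policy's parameter count grows only additively, which is what makes optimization over combinatorial profile spaces tractable.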