Efficient Distributionally Robust Assortment Optimization in MNL Bandits
Yunfan Zhang ⋅ Yuxuan Han ⋅ Zhengyuan Zhou
Abstract
We investigate the distributionally robust assortment optimization (DRAO) problem under the contextual multinomial logit (MNL) choice model, where the decision-maker seeks to maximize revenue against worst-case distributional deviations. To address potential distribution shifts relative to the observed data environment, we study DRAO under ambiguity sets defined by three divergences: total variation (TV), Kullback–Leibler (KL), and chi-square ($\chi^2$). Incorporating robustness considerations poses challenges for both algorithm design and theoretical analysis. By leveraging strong duality results from the distributionally robust optimization literature and integrating them into the assortment optimization procedure, we develop tailored polynomial-time algorithms for each divergence. We further provide a theoretical analysis and establish sample complexity bounds for all three robust formulations.
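As an illustration of the kind of strong duality result the abstract refers to, the following is the standard dual form of a KL-constrained worst-case expectation from the distributionally robust optimization literature. The notation is assumed here for illustration and is not taken from the paper: $R$ is a bounded random revenue, $P$ is the nominal distribution, $\rho > 0$ is the radius of the ambiguity set, and $\lambda$ is the dual variable. The paper's own formulation may differ.

$$
\inf_{Q \,:\, D_{\mathrm{KL}}(Q \,\|\, P) \le \rho} \mathbb{E}_{Q}[R]
\;=\;
\sup_{\lambda > 0} \Big\{ -\lambda \log \mathbb{E}_{P}\big[ e^{-R/\lambda} \big] - \lambda \rho \Big\}
$$

The left-hand side is an optimization over distributions, while the right-hand side is a one-dimensional concave maximization in $\lambda$ that only requires expectations under $P$. Reductions of this type are what can make a robust objective tractable enough to embed inside an assortment search.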