Poster in Workshop: “Could it have been different?” Counterfactuals in Minds and Machines

Counterfactual Learning to Rank via Knowledge Distillation
Ehsan Ebrahimzadeh · Alex Cozzi · Abraham Bagherjeiran
Knowledge distillation is a transfer learning technique that improves the performance of a student model by training it on a distilled empirical risk, formed from a label distribution defined by a teacher model, which is typically trained on the same task and belongs to a hypothesis class with richer representational capacity. In this work, we study knowledge distillation in the context of counterfactual Learning to Rank (LTR) from implicit user feedback. We consider a generic partial-information search ranking scenario, where the relevance of the items in the logged search context is observed only in the event of an explicit user engagement. The premise of using knowledge distillation in this counterfactual setup is to leverage the teacher's distilled knowledge, in the form of soft predicted relevance labels, to help the student with more effective list-wise comparisons, variance reduction, and improved generalization behavior. We build empirical risk estimates that rely not only on the de-biased observed user feedback via standard Inverse Propensity Weighting (IPW), but also on the teacher's distilled knowledge via potential outcome modeling. Our distillation-based counterfactual LTR framework offers a new perspective on how explanatory click models, trained for a click prediction task with privileged encoding of the confounding search context, can explain away the effect of presentation-related confounding for the student model, which is trained for a ranking task. We analyze the generalization performance of the proposed empirical risk estimators from a theoretical perspective by establishing bounds on their estimation error. We also conduct rigorous counterfactual offline evaluations as well as online randomized controlled experiments for a search ranking task on a major e-commerce platform. We report strong empirical results showing that the distilled knowledge from a teacher trained on expert judgments can significantly improve the generalization performance of the student ranker.
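
To make the idea of combining IPW-debiased feedback with teacher soft labels concrete, the following is a minimal sketch for a single logged query. It is not the estimator proposed in this work: the list-wise softmax cross-entropy loss, the convex mixing weight `alpha`, and all function and parameter names are assumptions chosen purely for illustration.

```python
import numpy as np


def log_softmax(scores):
    """Numerically stable log-softmax over one ranked list."""
    shifted = scores - np.max(scores)
    return shifted - np.log(np.sum(np.exp(shifted)))


def ipw_listwise_risk(student_scores, clicks, propensities):
    """List-wise loss on observed clicks, reweighted by inverse propensity.

    Assumes propensities are strictly positive examination probabilities.
    """
    weights = clicks / propensities  # unclicked items get weight 0
    return -np.sum(weights * log_softmax(student_scores))


def distilled_listwise_risk(student_scores, teacher_relevance):
    """Cross-entropy between the teacher's label distribution and the student.

    Assumes teacher_relevance is non-negative with a positive sum.
    """
    target = teacher_relevance / np.sum(teacher_relevance)
    return -np.sum(target * log_softmax(student_scores))


def combined_risk(student_scores, clicks, propensities, teacher_relevance, alpha=0.5):
    """Hypothetical convex mixture of the two risks (alpha is an assumption)."""
    ipw = ipw_listwise_risk(student_scores, clicks, propensities)
    distilled = distilled_listwise_risk(student_scores, teacher_relevance)
    return alpha * ipw + (1.0 - alpha) * distilled


if __name__ == "__main__":
    scores = np.array([2.0, 0.5, 0.1, -1.0])        # student ranking scores
    clicks = np.array([0.0, 1.0, 0.0, 0.0])         # observed engagement
    propensities = np.array([0.9, 0.5, 0.3, 0.1])   # examination probabilities
    teacher = np.array([0.7, 0.6, 0.2, 0.05])       # teacher soft relevance labels
    print(combined_risk(scores, clicks, propensities, teacher))
```

In this sketch the IPW term only receives signal from clicked items, whereas the distilled term supplies a soft target for every item in the list, which is one intuitive reading of how teacher labels could support list-wise comparisons and reduce variance.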