Online platforms regularly conduct randomized experiments to understand how changes to the platform causally affect various outcomes of interest. However, experimentation on online platforms has been criticized for having, among other issues, a lack of meaningful oversight and user consent. As platforms give users greater agency, it becomes possible to conduct observational studies in which users self-select into the treatment of interest as an alternative to experiments in which the platform controls whether the user receives treatment or not.
In this paper, we conduct four large-scale within-study comparisons on Twitter aimed at assessing the effectiveness of observational studies derived from user self-selection on online platforms. In a within-study comparison, treatment effects from an observational study are assessed based on how effectively they replicate results from a randomized experiment with the same target population. We test the naive difference in group means estimator, exact matching, regression adjustment, and propensity score weighting while controlling for plausible confounding variables.
Across all four comparisons, the observational estimates perform poorly at recovering the ground-truth estimates from the corresponding randomized experiments. Our results suggest that observational studies derived from user self-selection are a poor alternative to randomized experimentation on online platforms. In discussing our results, we present a “Catch-22” that undermines the use of causal inference in these settings: we give users control because we postulate that there is no adequate model for predicting user behavior, but performing observational causal inference successfully requires exactly that.
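To make the estimators named above concrete, here is a minimal illustrative sketch of two of them, the naive difference in group means and propensity score weighting, on synthetic data. This is not the authors' code; the covariates, the self-selection mechanism, and all variable names are assumptions made only for illustration.

```python
# Hedged sketch of two estimators mentioned in the abstract: the naive
# difference in group means and inverse propensity score weighting (IPW).
# All data here is synthetic and hypothetical; this is not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical observational data: X are observed covariates (possible
# confounders), t is the self-selected treatment, y is the outcome.
n = 10_000
X = rng.normal(size=(n, 3))
p_treat = 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1])))  # self-selection depends on X
t = rng.binomial(1, p_treat)
y = 2.0 * t + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

# 1. Naive difference in group means (ignores confounding entirely).
naive = y[t == 1].mean() - y[t == 0].mean()

# 2. Inverse propensity score weighting: model P(t = 1 | X), then reweight
#    treated and control outcomes by the estimated propensities.
ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]
ipw = np.mean(t * y / ps) - np.mean((1 - t) * y / (1 - ps))

print(f"naive difference in means: {naive:.3f}")
print(f"IPW estimate:              {ipw:.3f}")
```

The sketch only shows the mechanics of the estimators; the paper's finding is that such adjustments can still fail to recover experimental ground truth when unobserved factors drive users' self-selection.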
Author Information
Smitha Milli (UC Berkeley)
Luca Belli (Twitter)
Moritz Hardt (UC Berkeley)
More from the Same Authors
- 2021 Poster: Alternative Microfoundations for Strategic Classification
  Meena Jagadeesan · Celestine Mendler-Dünner · Moritz Hardt
- 2021 Spotlight: Alternative Microfoundations for Strategic Classification
  Meena Jagadeesan · Celestine Mendler-Dünner · Moritz Hardt
- 2020 Poster: Performative Prediction
  Juan Perdomo · Tijana Zrnic · Celestine Mendler-Dünner · Moritz Hardt
- 2020 Poster: Strategic Classification is Causal Modeling in Disguise
  John Miller · Smitha Milli · Moritz Hardt
- 2020 Poster: Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
  Yu Sun · Xiaolong Wang · Zhuang Liu · John Miller · Alexei Efros · Moritz Hardt
- 2020 Poster: Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning
  Esther Rolf · Max Simchowitz · Sarah Dean · Lydia T. Liu · Daniel Bjorkegren · Moritz Hardt · Joshua Blumenstock
- 2019 Poster: Natural Analysts in Adaptive Data Analysis
  Tijana Zrnic · Moritz Hardt
- 2019 Poster: The Implicit Fairness Criterion of Unconstrained Learning
  Lydia T. Liu · Max Simchowitz · Moritz Hardt
- 2019 Oral: The Implicit Fairness Criterion of Unconstrained Learning
  Lydia T. Liu · Max Simchowitz · Moritz Hardt
- 2019 Oral: Natural Analysts in Adaptive Data Analysis
  Tijana Zrnic · Moritz Hardt
- 2018 Poster: Delayed Impact of Fair Machine Learning
  Lydia T. Liu · Sarah Dean · Esther Rolf · Max Simchowitz · Moritz Hardt
- 2018 Oral: Delayed Impact of Fair Machine Learning
  Lydia T. Liu · Sarah Dean · Esther Rolf · Max Simchowitz · Moritz Hardt