
Workshop: Spurious correlations, Invariance, and Stability (SCIS)

Selection Bias Induced Spurious Correlations in Large Language Models

Emily McMilin

Keywords: [ Spurious Correlations ] [ Large Language Models ] [ Causal Inference ]


In this work we explore the role of dataset selection bias in inducing and amplifying spurious correlations in large language models (LLMs). To highlight known discrepancies between gender representation in society and gender representation as recorded in datasets, we developed a gender pronoun prediction task. We demonstrate and explain a dose-response relationship in the magnitude of the correlation between gender pronoun prediction and a variety of seemingly gender-neutral variables, such as date and location, in pre-trained (unmodified) BERT, DistilBERT, and XLM-RoBERTa models. We also fine-tune several models on the gender pronoun prediction task to further highlight the spurious correlation mechanism, and we argue that this mechanism generalizes to far more datasets. Finally, we provide an online demo, inviting readers to experiment with their own interventions.
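
As a rough illustration of the kind of measurement the abstract describes, the sketch below varies a date in a masked prompt, reads off how much of the gendered-pronoun probability falls on female pronouns, and fits a least-squares slope across dates. It is not the author's implementation: the template, the pronoun lists, and the names `female_share` and `dose_response` are all hypothetical, and the masked language model is abstracted as a `predict` callable that maps a prompt to token probabilities for the masked position.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical choices for illustration; the abstract does not specify them.
FEMALE = ("she", "her")
MALE = ("he", "him", "his")
TEMPLATE = "In {year}, [MASK] worked as a doctor."


def female_share(mask_probs: Dict[str, float]) -> float:
    """Share of gendered-pronoun probability mass on female pronouns."""
    f = sum(mask_probs.get(tok, 0.0) for tok in FEMALE)
    m = sum(mask_probs.get(tok, 0.0) for tok in MALE)
    total = f + m
    return f / total if total > 0 else float("nan")


def dose_response(
    years: Sequence[int],
    predict: Callable[[str], Dict[str, float]],
) -> Tuple[float, List[float]]:
    """Return (least-squares slope of female share vs. year, per-year shares)."""
    shares = [female_share(predict(TEMPLATE.format(year=y))) for y in years]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    return sxy / sxx, shares
```

In practice `predict` could wrap a masked language model, for example a Hugging Face `fill-mask` pipeline, whose results include `token_str` and `score` fields. The mask token differs between models (XLM-RoBERTa uses `<mask>` rather than `[MASK]`), so the template would need adjusting per model.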
