Datasets scraped from the internet have been critical to large-scale machine learning. Yet this success puts the utility of future internet-derived datasets at risk, as model outputs begin to replace human annotations as a source of supervision. In this work, we formalize a system in which interactions with one model are recorded as history and scraped as training data in the future. We then analyze its stability over time by tracking changes to a test-time bias statistic (e.g., the gender bias of model predictions). We find that the degree of bias amplification is closely linked to whether the model's outputs behave like samples from the training distribution, a behavior we characterize and define as uniform faithfulness. Experiments in three conditional prediction scenarios -- image classification, visual role-labeling, and language generation -- demonstrate that models exhibiting sampling-like behavior are more faithful and thus more stable. Based on this insight, we propose an intervention to help mitigate and stabilize unstable feedback systems.
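The feedback system the abstract describes can be illustrated with a minimal simulation (a hypothetical toy setup for intuition, not the paper's actual experiments): a model is refit on a data pool, its outputs are scraped back into the pool, and a bias statistic is tracked across rounds. A "faithful" model samples from its learned distribution, while a mode-seeking model always emits its most likely label.

```python
import random

def train(pool):
    """Fit a trivial 'model': estimate P(label = 1) from the data pool."""
    return sum(pool) / len(pool)

def sample_outputs(p, n, faithful=True, rng=None):
    """Generate n labels. A faithful model samples from its learned
    distribution; a mode-seeking one always outputs the argmax label."""
    rng = rng or random.Random(0)
    if faithful:
        return [1 if rng.random() < p else 0 for _ in range(n)]
    return [1 if p >= 0.5 else 0] * n  # collapses to the mode each round

def run_loop(rounds=10, faithful=True, seed=0):
    """Simulate the feedback loop and record the bias statistic per round."""
    rng = random.Random(seed)
    pool = [1] * 60 + [0] * 40  # initial pool with a 60% positive-label bias
    history = []
    for _ in range(rounds):
        p = train(pool)                                 # refit on current pool
        history.append(p)                               # track bias statistic
        pool += sample_outputs(p, 100, faithful, rng)   # outputs scraped back
    return history

faithful_bias = run_loop(faithful=True)
greedy_bias = run_loop(faithful=False)
```

Under these assumptions, the faithful model's bias statistic hovers near its starting value of 0.6 across rounds, while the mode-seeking model amplifies the initial bias toward 1.0 (adding 100 positive labels every round), mirroring the paper's claim that sampling-like behavior stabilizes the loop.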
Author Information
Rohan Taori (Stanford University)
Tatsunori Hashimoto (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
- 2023 Poster: Data Feedback Loops: Model-driven Amplification of Dataset Biases
  Thu, Jul 27 through Fri, Jul 28 · Exhibit Hall 1 #416
More from the Same Authors
- 2023: Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
  Daniel Kang · Xuechen Li · Ion Stoica · Carlos Guestrin · Matei Zaharia · Tatsunori Hashimoto
- 2023 Poster: Coder Reviewer Reranking for Code Generation
  Tianyi Zhang · Tao Yu · Tatsunori Hashimoto · Mike Lewis · Scott Yih · Daniel Fried · Sida Wang
- 2023 Poster: Whose Opinions Do Language Models Reflect?
  Shibani Santurkar · Esin Durmus · Faisal Ladhak · Cinoo Lee · Percy Liang · Tatsunori Hashimoto
- 2023 Oral: Whose Opinions Do Language Models Reflect?
  Shibani Santurkar · Esin Durmus · Faisal Ladhak · Cinoo Lee · Percy Liang · Tatsunori Hashimoto
- 2023 Oral: Evaluating Self-Supervised Learning via Risk Decomposition
  Yann Dubois · Tatsunori Hashimoto · Percy Liang
- 2023 Poster: Evaluating Self-Supervised Learning via Risk Decomposition
  Yann Dubois · Tatsunori Hashimoto · Percy Liang
- 2023 Poster: Out-of-Domain Robustness via Targeted Augmentations
  Irena Gao · Shiori Sagawa · Pang Wei Koh · Tatsunori Hashimoto · Percy Liang
- 2022 Poster: Identifiability Conditions for Domain Adaptation
  Ishaan Gulrajani · Tatsunori Hashimoto
- 2022 Spotlight: Identifiability Conditions for Domain Adaptation
  Ishaan Gulrajani · Tatsunori Hashimoto
- 2021 Poster: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
  John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt
- 2021 Spotlight: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
  John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt