Popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems can amplify stereotypes in important contexts. Although some methods have been developed to mitigate this problem, how word embedding biases arise during training is poorly understood. In this work, we develop a technique to address this question. Given a word embedding, our method reveals how perturbing the training corpus would affect the resulting embedding bias. By tracing the origins of word embedding bias back to the original training documents, one can identify the subsets of documents whose removal would most reduce bias. We demonstrate our methodology on Wikipedia and New York Times corpora and find it to be very accurate.
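The idea of asking how a corpus perturbation changes embedding bias can be illustrated with a brute-force toy sketch. This is not the authors' technique: it simply rebuilds a crude embedding after removing each document in turn and measures the change in a simple bias score. The embedding (term-document count vectors), the bias score (a difference of cosine similarities), the function names, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_document_vectors(docs, vocab):
    """Embed each vocab word as its vector of per-document counts (toy embedding)."""
    vectors = {word: [] for word in vocab}
    for doc in docs:
        counts = Counter(doc.lower().split())
        for word in vocab:
            vectors[word].append(float(counts[word]))
    return vectors


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bias_score(vectors, target, attr_a, attr_b):
    """Hypothetical bias score: association with attr_a minus association with attr_b."""
    return cosine(vectors[target], vectors[attr_a]) - cosine(vectors[target], vectors[attr_b])


def leave_one_out_bias_change(docs, vocab, target, attr_a, attr_b):
    """Rebuild the embedding without each document and record the change in bias."""
    base = bias_score(term_document_vectors(docs, vocab), target, attr_a, attr_b)
    changes = []
    for i in range(len(docs)):
        reduced = docs[:i] + docs[i + 1:]
        score = bias_score(term_document_vectors(reduced, vocab), target, attr_a, attr_b)
        changes.append((i, score - base))
    return base, changes


# Toy corpus (invented for illustration only).
toy_docs = [
    "she is a nurse",
    "he is an engineer",
    "she is a nurse at the hospital",
    "he is a nurse",
]
toy_vocab = ["he", "she", "nurse"]

base, changes = leave_one_out_bias_change(toy_docs, toy_vocab, "nurse", "she", "he")
print(f"baseline bias: {base:.3f}")
for doc_index, delta in sorted(changes, key=lambda c: c[1]):
    print(f"remove doc {doc_index}: bias change {delta:+.3f}")
```

A negative change means that removing the document lowers the score, so the documents at the top of the sorted output are the candidates whose removal would most reduce bias in this toy setting. Removing the one document that pairs "he" with "nurse" raises the score instead. Retraining once per document does not scale to corpora such as Wikipedia, which is why an efficient way to estimate these changes is needed.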
Author Information
Marc-Etienne Brunet (University of Toronto)
Colleen Alkalay-Houlihan (University of Toronto)
Ashton Anderson (University of Toronto)
Richard Zemel (Vector Institute)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Understanding the Origins of Bias in Word Embeddings
  Thu. Jun 13th 01:30 -- 04:00 AM Room Pacific Ballroom #146
More from the Same Authors
- 2021: Online Algorithmic Recourse by Collective Action
  Elliot Creager · Richard Zemel
- 2022: Towards Environment-Invariant Representation Learning for Robust Task Transfer
  Benjamin Eyre · Richard Zemel · Elliot Creager
- 2023: Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
  Benjamin Eyre · Elliot Creager · David Madras · Vardan Papyan · Richard Zemel
- 2023 Test Of Time: Learning Fair Representations
  Richard Zemel · Yu Wu · Kevin Swersky · Toniann Pitassi · Cynthia Dwork
- 2022: Invited talks 3, Q/A, Amy, Rich and Liting
  Liting Sun · Amy Zhang · Richard Zemel
- 2022: Invited talks 3, Amy Zhang, Rich Zemel and Liting Sun
  Amy Zhang · Richard Zemel · Liting Sun
- 2021 Poster: SketchEmbedNet: Learning Novel Concepts by Imitating Drawings
  Alexander Wang · Mengye Ren · Richard Zemel
- 2021 Poster: Learning a Universal Template for Few-shot Dataset Generalization
  Eleni Triantafillou · Hugo Larochelle · Richard Zemel · Vincent Dumoulin
- 2021 Poster: Environment Inference for Invariant Learning
  Elliot Creager · Joern-Henrik Jacobsen · Richard Zemel
- 2021 Spotlight: Learning a Universal Template for Few-shot Dataset Generalization
  Eleni Triantafillou · Hugo Larochelle · Richard Zemel · Vincent Dumoulin
- 2021 Spotlight: Environment Inference for Invariant Learning
  Elliot Creager · Joern-Henrik Jacobsen · Richard Zemel
- 2021 Spotlight: SketchEmbedNet: Learning Novel Concepts by Imitating Drawings
  Alexander Wang · Mengye Ren · Richard Zemel
- 2021 Poster: On Monotonic Linear Interpolation of Neural Network Parameters
  James Lucas · Juhan Bae · Michael Zhang · Stanislav Fort · Richard Zemel · Roger Grosse
- 2021 Spotlight: On Monotonic Linear Interpolation of Neural Network Parameters
  James Lucas · Juhan Bae · Michael Zhang · Stanislav Fort · Richard Zemel · Roger Grosse
- 2020: Invited Talk 4: Prof. Richard Zemel from University of Toronto
  Richard Zemel
- 2020 Workshop: Participatory Approaches to Machine Learning
  Angela Zhou · David Madras · Deborah Raji · Smitha Milli · Bogdan Kulynych · Richard Zemel
- 2020 Poster: Causal Modeling for Fairness In Dynamical Systems
  Elliot Creager · David Madras · Toniann Pitassi · Richard Zemel
- 2020 Poster: Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach
  Martin Mladenov · Elliot Creager · Omer Ben-Porat · Kevin Swersky · Richard Zemel · Craig Boutilier
- 2020 Poster: Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling
  Will Grathwohl · Kuan-Chieh Wang · Joern-Henrik Jacobsen · David Duvenaud · Richard Zemel
- 2019 Workshop: Learning and Reasoning with Graph-Structured Representations
  Ethan Fetaya · Zhiting Hu · Thomas Kipf · Yujia Li · Xiaodan Liang · Renjie Liao · Raquel Urtasun · Hao Wang · Max Welling · Eric Xing · Richard Zemel
- 2019 Poster: Lorentzian Distance Learning for Hyperbolic Representations
  Marc Law · Renjie Liao · Jake Snell · Richard Zemel
- 2019 Poster: Flexibly Fair Representation Learning by Disentanglement
  Elliot Creager · David Madras · Joern-Henrik Jacobsen · Marissa Weis · Kevin Swersky · Toniann Pitassi · Richard Zemel
- 2019 Oral: Lorentzian Distance Learning for Hyperbolic Representations
  Marc Law · Renjie Liao · Jake Snell · Richard Zemel
- 2019 Oral: Flexibly Fair Representation Learning by Disentanglement
  Elliot Creager · David Madras · Joern-Henrik Jacobsen · Marissa Weis · Kevin Swersky · Toniann Pitassi · Richard Zemel
- 2018 Poster: Learning Adversarially Fair and Transferable Representations
  David Madras · Elliot Creager · Toniann Pitassi · Richard Zemel
- 2018 Oral: Learning Adversarially Fair and Transferable Representations
  David Madras · Elliot Creager · Toniann Pitassi · Richard Zemel
- 2018 Poster: Reviving and Improving Recurrent Back-Propagation
  Renjie Liao · Yuwen Xiong · Ethan Fetaya · Lisa Zhang · Kijung Yoon · Zachary S Pitkow · Raquel Urtasun · Richard Zemel
- 2018 Poster: Distilling the Posterior in Bayesian Neural Networks
  Kuan-Chieh Wang · Paul Vicol · James Lucas · Li Gu · Roger Grosse · Richard Zemel
- 2018 Oral: Distilling the Posterior in Bayesian Neural Networks
  Kuan-Chieh Wang · Paul Vicol · James Lucas · Li Gu · Roger Grosse · Richard Zemel
- 2018 Oral: Reviving and Improving Recurrent Back-Propagation
  Renjie Liao · Yuwen Xiong · Ethan Fetaya · Lisa Zhang · Kijung Yoon · Zachary S Pitkow · Raquel Urtasun · Richard Zemel
- 2018 Poster: Neural Relational Inference for Interacting Systems
  Thomas Kipf · Ethan Fetaya · Kuan-Chieh Wang · Max Welling · Richard Zemel
- 2018 Oral: Neural Relational Inference for Interacting Systems
  Thomas Kipf · Ethan Fetaya · Kuan-Chieh Wang · Max Welling · Richard Zemel
- 2017 Poster: Deep Spectral Clustering Learning
  Marc Law · Raquel Urtasun · Richard Zemel
- 2017 Talk: Deep Spectral Clustering Learning
  Marc Law · Raquel Urtasun · Richard Zemel