Poster in Workshop: Challenges in Deployable Generative AI
AutoBiasTest: Controllable Test Sentence Generation for Open-Ended Social Bias Testing in Language Models at Scale
Rafal Kocielnik · Shrimai Prabhumoye · Vivian Zhang · R. Alvarez · Anima Anandkumar
Keywords: [ social bias testing ] [ Generative AI ] [ social bias ] [ Language Models ]
Social bias in Pretrained Language Models (PLMs) affects text generation and other downstream NLP tasks. Existing bias testing methods rely predominantly on manual templates or on expensive crowd-sourced data. We propose AutoBiasTest, a novel method that automatically generates controlled sentences for testing bias in PLMs, providing a flexible and low-cost alternative. Our approach uses another PLM as a generator, controlling its output by conditioning on social group and attribute terms. We show that the generated sentences are natural and similar to human-produced content in terms of word length and diversity. We find that our bias scores correlate well with those from manual templates, while AutoBiasTest also surfaces biases that the templates miss thanks to more diverse and realistic contexts. By automating large-scale test sentence generation, we enable better estimation of underlying bias distributions.
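To make the pipeline concrete, below is a minimal sketch (not the authors' released code) of the two stages the abstract describes: (1) a generator PLM produces a test sentence conditioned on a social group term and an attribute term, and (2) a tested PLM is scored for bias on that sentence. The model names, prompt wording, and the pseudo-log-likelihood scoring rule are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of an AutoBiasTest-style pipeline using Hugging Face transformers.
# Assumptions (not from the paper): GPT-2 as the generator, BERT as the tested PLM,
# a simple prompt template, and a pseudo-log-likelihood (PLL) based bias score.
import torch
from transformers import AutoModelForCausalLM, AutoModelForMaskedLM, AutoTokenizer

gen_name = "gpt2"                   # assumed generator PLM
tested_name = "bert-base-uncased"   # assumed tested PLM

gen_tok = AutoTokenizer.from_pretrained(gen_name)
gen_model = AutoModelForCausalLM.from_pretrained(gen_name)
mlm_tok = AutoTokenizer.from_pretrained(tested_name)
mlm_model = AutoModelForMaskedLM.from_pretrained(tested_name)


def generate_test_sentence(group: str, attribute: str) -> str:
    """Generate one test sentence conditioned on a group and an attribute term."""
    prompt = f"Write a sentence about {group} and {attribute}:"  # assumed prompt format
    ids = gen_tok(prompt, return_tensors="pt").input_ids
    out = gen_model.generate(
        ids, max_new_tokens=20, do_sample=True, top_p=0.9,
        pad_token_id=gen_tok.eos_token_id,
    )
    text = gen_tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True)
    return text.strip().split("\n")[0]


def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log-probs of each token when masked one at a time (PLL scoring)."""
    input_ids = mlm_tok(sentence, return_tensors="pt").input_ids[0]
    total = 0.0
    for i in range(1, input_ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = mlm_tok.mask_token_id
        with torch.no_grad():
            logits = mlm_model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[input_ids[i]].item()
    return total


def bias_score(sentence: str, group_a: str, group_b: str) -> float:
    """Positive score => the tested PLM prefers the sentence mentioning group_a
    over the counterfactual with group_b substituted in its place."""
    counterfactual = sentence.replace(group_a, group_b)
    return pseudo_log_likelihood(sentence) - pseudo_log_likelihood(counterfactual)


# Usage: generate a sentence for one group/attribute pair, then compare how the
# tested PLM scores it against the counterfactual with the other group term.
sent = generate_test_sentence("women", "leadership")
print(sent, bias_score(sent, "women", "men"))
```

In a full run, this generate-then-score loop would be repeated over many group/attribute pairs and many sampled sentences per pair, which is what enables the large-scale estimation of bias distributions mentioned above.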