Poster Thu, Jul 9, 2026 • 10:30 AM – 12:15 PM KST Coex: HALL A

Around the World in Eighty Ratings? Quantifying the Salience of Geo-Cultural Values for Pluralistic Alignment

Arkadiy Saakyan ⋅ Charvi Rastogi ⋅ Lora Aroyo

Abstract

Safe global deployment of AI models requires alignment with pluralistic human values, yet in existing safety evaluation datasets the rater pools remain largely homogeneous along geo-cultural dimensions. Through a meta-analysis of existing safety datasets, we observe that the vast majority do not include any geo-cultural information, and those that do lack a robust approach to collecting and understanding cultural differences in safety ratings. Using the Inglehart-Welzel dimensions of cross-cultural variation, we demonstrate via hierarchical linear modeling that geo-cultural values predict safety ratings significantly better than demographic factors alone ($p<0.05$ in $6$ datasets). Further, our analysis shows that several safety datasets contain at least 10% culturally sensitive items, for which a lack of cultural representation in the rater pool would lead to a false negative in safety classification. Finally, we provide empirical evidence that fine-tuned LLMs can identify culturally sensitive items but cannot reliably emulate the judgments of raters from diverse cultural backgrounds, underscoring the critical need for continuous geo-culturally stratified (pluralistic) safety evaluations.
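
As an illustration of the kind of nested comparison that hierarchical linear modeling allows, one possible specification is sketched below. The abstract does not give the exact model, so every symbol here is an assumption: $y_{ij}$ is the safety rating given by rater $j$ to item $i$, $d_j$ is a vector of the rater's demographic covariates, $c_j$ holds the two Inglehart-Welzel scores (traditional vs. secular-rational and survival vs. self-expression) for the rater's country, and $u_i$, $v_j$ are random intercepts for items and raters.

$$
\begin{aligned}
M_0 &: \quad y_{ij} = \beta_0 + \beta^\top d_j + u_i + v_j + \varepsilon_{ij} \\
M_1 &: \quad y_{ij} = \beta_0 + \beta^\top d_j + \gamma^\top c_j + u_i + v_j + \varepsilon_{ij} \\
\Lambda &= -2\left(\log L_{M_0} - \log L_{M_1}\right) \;\sim\; \chi^2_{2} \quad \text{under } H_0 : \gamma = 0
\end{aligned}
$$

Under this sketch, a significant likelihood-ratio statistic $\Lambda$ (with both models fitted by maximum likelihood) would indicate that the geo-cultural scores explain variation in ratings beyond what the demographic covariates capture. The two degrees of freedom correspond to the two assumed components of $c_j$.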