Beyond Literal Translation: Evaluating Cultural Effectiveness in Social Media UGC
Abstract
Social media platforms enable large-scale cross-lingual communication, yet translating user-generated content (UGC) remains challenging due to its informal style, culture-laden expressions, and interaction-driven nuances. While recent LLMs have advanced translation quality, existing benchmarks and metrics often overlook whether translations preserve intended meaning and cultural resonance in real-world contexts. In this work, we introduce CULTURE-MT, a benchmark for social media translation that explicitly emphasizes CULtural Transmission and UGC-specific emotion REsonance. CULTURE-MT comprises 1,002 Chinese-to-English UGC notes spanning 14 domains, systematically categorized into four types based on culture-loaded symbols and linguistic styles. We also construct UGC-oriented training data and use it to fine-tune Qwen3-8B and Qwen3-32B as strong baselines. We propose a cultural effectiveness criterion and train an accompanying JUDGER model that jointly assesses expression accuracy and cultural adaptability. Evaluating 15 models, we find that standard automatic metrics are largely insensitive to cultural effectiveness. Our work establishes a comprehensive framework for evaluating and advancing UGC translation, and we will provide an open evaluation platform to support future research in culturally effective UGC translation.