Measuring Meta-Cultural Competency: A Spectral Framework for LLM Knowledge Structures
Abstract
Most existing cultural evaluation frameworks for large language models (LLMs) focus on matching model outputs to ground-truth answers, primarily measuring factual cultural awareness. This overlooks whether models internalize broader cultural structure and pluralism. We introduce a spectral-analysis-based framework that captures large-scale macrostructural patterns in models' cultural knowledge, and we evaluate eight LLMs across nine cultural domains spanning all five of Newmark's cultural dimensions and 170 countries. Comparing model outputs with human data, we find that instruction-tuned models align more closely with human cultural structure than older models do, while increased model size does not consistently improve performance. Finally, simulation-based experiments show that our proposed spectral metric better predicts a model's ability to serve users from unfamiliar cultural backgrounds than existing metrics do.