TajweedWER: A Recitation-Aware Evaluation Metric for Automatic Speech Recognition in Quranic Arabic
Abstract
Automatic Speech Recognition (ASR) systems are increasingly deployed for Quranic recitation learning and verification, yet standard evaluation metrics such as Word Error Rate (WER) fail to capture the phonological requirements of Tajweed, the set of rules governing correct Quranic recitation. We introduce TajweedWER, a recitation-aware evaluation metric that decomposes ASR errors into two components: a Base WER, which measures general transcription accuracy, and a Tajweed Penalty Score (TPS), which quantifies the additional error attributable to Tajweed diacritical complexity. Evaluating Whisper-small on the Mozilla Common Voice Yoruba and Quran Reciters datasets, across 8 professional reciters reciting Surah Al-Fatiha, we report a mean Surface WER of 100.8%, a mean Base WER of 87.3%, and a mean TPS of 13.6 percentage points. These results indicate that standard WER systematically underestimates ASR failure on Quranic text because it obscures the specific contribution of Tajweed diacritical complexity. TajweedWER provides the Muslim AI community with a principled evaluation framework for Quranic ASR development, and all code is released as open source.
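To make the decomposition concrete, the following is a minimal sketch and not the released implementation. It rests on assumptions the abstract does not confirm: that Surface WER is computed on fully diacritized text, that Base WER is computed after stripping diacritics from both reference and hypothesis, and that TPS is their difference. The function names (`strip_diacritics`, `wer`, `tajweed_wer`) and the exact set of stripped code points are hypothetical choices made for illustration.

```python
import re

# Assumption: "diacritics" covers the harakat (U+064B-U+0652), the superscript
# alef (U+0670), and the Quranic annotation signs in U+06D6-U+06ED.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u06D6-\u06ED]")


def strip_diacritics(text: str) -> str:
    """Remove the assumed diacritic code points, leaving base letters."""
    return DIACRITICS.sub("", text)


def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Row-by-row dynamic programming over substitutions, deletions, insertions.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def tajweed_wer(reference: str, hypothesis: str) -> dict:
    """Assumed decomposition: TPS = Surface WER - Base WER."""
    surface = wer(reference, hypothesis)
    base = wer(strip_diacritics(reference), strip_diacritics(hypothesis))
    return {"surface_wer": surface, "base_wer": base, "tps": surface - base}
```

Because insertions count as errors while the denominator is the reference length, a WER computed this way can exceed 100%, which is consistent with a Surface WER above 100%.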