HiBias: A Nine-Faceted Bias Annotation Dataset for Media Bias Detection in Hindi
Abstract
Different forms of bias in news remain largely unexplored for Indian languages such as Hindi. Developing automated systems to detect and mitigate media bias in Hindi requires extensive and exhaustive datasets that cover multiple facets of bias. In this paper, we introduce the first annotated dataset consisting of 300 unique articles from two leading Indian news media agencies in the Hindi language. Our annotations cover nine different types of bias to ensure exhaustiveness, and we specifically ask the annotators to provide the rationale behind each bias judgment. These rationales, along with the dataset, will aid the development of trustworthy and explainable automated systems for bias detection. The dataset and replication code are publicly available at https://zenodo.org/records/18253271 and https://github.com/KushalTrivedi19032005/Indic-Media-Bias-Detection-Dataset.