Does AI Assistance Preserve or Collapse Disagreement? A Study of Pre-Annotations in Ambiguous Video Labeling
Juan Gutiérrez ⋅ Víctor Gutiérrez-García ⋅ Jose L Blanco-Murillo
Abstract
AI-generated Pre-Annotations can accelerate video labeling, but they may also anchor annotators to model priors and suppress disagreement that is valuable for pluralistic dataset construction. We study this tradeoff in ambiguous temporal video annotation, where annotators choose event boundaries and assign context-dependent labels such as "normal" or "abnormal." We introduce a controlled audit protocol that separates annotation cost, consensus alignment, inter-annotator consistency, temporal-boundary variation, semantic-label variation, latent-space standardization, and edit behavior. In a counterbalanced pilot study with 18 annotators and 180 annotation sessions, a fixed CLIP-based Pre-Annotation engine reduced mean annotation time by 23.11%; 72% of annotators were faster with assistance, with a median per-annotator gain of 35%. Assistance increased inter-annotator consistency and CLIP-space standardization while maintaining comparable alignment with a human consensus diagnostic (AMI $\approx 0.64$ in both conditions). These findings suggest that Pre-Annotations acted mainly as boundary-standardization scaffolds in this setting, and we observed no large aggregate shift away from human consensus. We contribute an audit framework and anonymized interaction traces for studying when AI-assisted annotation preserves, reshapes, or collapses human disagreement.
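The abstract reports both a reduction in mean annotation time and a median per-annotator gain, and these two summaries need not agree. The sketch below illustrates why, under stated assumptions: "gain" is taken to mean the relative time reduction (manual minus assisted, divided by manual), which the abstract does not define, and the timing values and annotator names are invented toy numbers, not data from the study.

```python
from statistics import mean, median

# Hypothetical (manual_seconds, assisted_seconds) per annotator.
# Toy values for illustration only; not taken from the study.
times = {
    "a1": (100.0, 60.0),
    "a2": (120.0, 80.0),
    "a3": (400.0, 380.0),  # a slow annotator who barely benefits
    "a4": (90.0, 99.0),    # an annotator who is slower with assistance
}


def aggregate_reduction(times):
    """Relative reduction of the mean time, pooled over annotators."""
    manual = mean(m for m, _ in times.values())
    assisted = mean(a for _, a in times.values())
    return (manual - assisted) / manual


def per_annotator_gains(times):
    """Relative time reduction computed separately for each annotator."""
    return {name: (m - a) / m for name, (m, a) in times.items()}


gains = per_annotator_gains(times)
agg = aggregate_reduction(times)
med = median(gains.values())
share_faster = sum(g > 0 for g in gains.values()) / len(gains)

print(f"reduction in mean time:    {agg:.1%}")
print(f"median per-annotator gain: {med:.1%}")
print(f"share of annotators faster: {share_faster:.0%}")
```

In this toy setup the pooled reduction is dominated by the slowest annotator, so it comes out lower than the median per-annotator gain, which weights every annotator equally.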