Poster in Workshop: Next Generation of AI Safety
Generated Audio Detectors Are Not Robust in Real-World Conditions
Soumya Shaw · Ben Nassi · Lea Schönherr
Keywords: [ deepfake ] [ audio ] [ Generative AI ] [ detection ]
The advent of generative AI (genAI) has transformed the digital landscape, revolutionizing our daily lives and technologies. However, the misuse of genAI has raised significant ethical and trust issues. Although substantial attention has been paid to detecting fake visual content, identifying fake audio has become increasingly relevant. Fake audio detection methods have been developed, but their evaluation under real-world conditions has largely escaped scrutiny. In this paper, we examine the efficacy of state-of-the-art fake audio detection methods under real-world conditions. By analyzing typical audio alterations introduced by transmission pipelines, we identify several vulnerabilities: (1) minimal changes such as sound level variations can bias detection performance, (2) inevitable physical effects such as background noise lead to classifier failures, (3) classifiers struggle to generalize across different datasets, and (4) network degradation affects the overall detection performance. Our results indicate that existing detectors have major issues differentiating between real and fake audio in practical applications, and that significant improvements are still necessary for reliable detection in real-world environments.
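To make the first two alterations concrete, the following is a minimal sketch of a sound level change and of additive background noise at a chosen signal-to-noise ratio. It is not the authors' evaluation pipeline: the helper names (`rms`, `apply_gain_db`, `add_noise_at_snr`), the white Gaussian noise model, the 440 Hz test tone, and the specific values (-6 dB gain, 10 dB SNR) are all assumptions chosen for illustration.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def apply_gain_db(signal, gain_db):
    """Scale the signal by a gain given in decibels (a sound level change)."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in signal]


def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled to a target SNR in decibels.

    White noise is an assumed stand-in for real background noise.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the noise so that 20*log10(rms(signal) / rms(noise)) == snr_db.
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    scale = target_noise_rms / rms(noise)
    return [s + scale * n for s, n in zip(signal, noise)]


# Hypothetical stand-in for a speech clip: one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
clean = [
    0.5 * math.sin(2 * math.pi * 440 * t / sample_rate)
    for t in range(sample_rate)
]

quieter = apply_gain_db(clean, -6.0)          # sound level variation
noisy = add_noise_at_snr(clean, snr_db=10.0)  # background noise
```

In a robustness check along these lines, `clean`, `quieter`, and `noisy` would each be passed to the detector under test and the resulting scores compared; the detector itself is deliberately left out of the sketch.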