

Poster
in
Workshop: 2nd ICML Workshop on New Frontiers in Adversarial Machine Learning

Transferable Adversarial Perturbations between Self-Supervised Speech Recognition Models

Raphaël Olivier · Hadi Abdullah · Bhiksha Raj

Keywords: [ black-box ] [ Speech Recognition ] [ Adversarial Attacks ] [ transferability ] [ self-supervised learning ]


Abstract:

A targeted adversarial attack produces audio samples that can force an Automatic Speech Recognition (ASR) system to output attacker-chosen text. To exploit ASR models in real-world, black-box settings, an adversary can leverage the transferability property, i.e. the fact that an adversarial sample produced for a proxy ASR model can also fool a different remote ASR model. Recent work has shown that achieving transferability against large ASR models is extremely difficult. In this work, we show that modern ASR architectures, specifically ones based on Self-Supervised Learning (SSL), are uniquely vulnerable to transferability. We demonstrate this phenomenon by evaluating state-of-the-art self-supervised ASR models such as Wav2Vec2, HuBERT, Data2Vec and WavLM. We show that, with relatively low-amplitude additive noise at a 30 dB Signal-to-Noise Ratio, we can achieve targeted transferability with up to 80% accuracy. We then use an ablation study to show that self-supervised learning is a major cause of this phenomenon. Our results are of dual interest: they show that modern ASR architectures are uniquely vulnerable to adversarial security threats, and they help in understanding the specific properties of SSL training paradigms.
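The abstract describes a targeted transfer attack: a perturbation is optimized on a white-box proxy ASR model under an SNR budget (30 dB in the paper) and then played against a different, black-box target model. The sketch below is not the authors' code; it is a minimal illustration of that setup in PyTorch, where `proxy_loss` (a loss such as CTC on the proxy model for the attacker-chosen transcription) and `target_transcribe` (decoding on the target model) are hypothetical stand-ins.

```python
import torch

def snr_db(clean: torch.Tensor, noise: torch.Tensor) -> float:
    """Signal-to-Noise Ratio in dB between a clean waveform and additive noise."""
    signal_power = clean.pow(2).mean()
    noise_power = noise.pow(2).mean().clamp_min(1e-12)
    return (10 * torch.log10(signal_power / noise_power)).item()

def targeted_transfer_attack(clean_audio, target_text, proxy_loss,
                             steps=1000, lr=1e-4, min_snr_db=30.0):
    """Optimize an additive perturbation on the proxy model, keeping SNR >= min_snr_db."""
    delta = torch.zeros_like(clean_audio, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Hypothetical loss of the proxy ASR model for the attacker-chosen text.
        loss = proxy_loss(clean_audio + delta, target_text)
        loss.backward()
        opt.step()
        with torch.no_grad():
            # Rescale the noise whenever it exceeds the SNR budget.
            current = snr_db(clean_audio, delta)
            if current < min_snr_db:
                delta.mul_(10 ** ((current - min_snr_db) / 20))
    return clean_audio + delta.detach()
```

Usage would follow the paper's evaluation protocol in spirit: craft `adv = targeted_transfer_attack(audio, target_text, proxy_loss)` on the proxy (e.g. one SSL model), then count the attack as transferring if `target_transcribe(adv) == target_text` on a different model.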
