
Voice2Series: Reprogramming Acoustic Models for Time Series Classification
Huck Yang · Yun-Yun Tsai · Pin-Yu Chen

Tue Jul 20 06:20 AM -- 06:25 AM (PDT)

Learning to classify time series with limited data is a practical yet challenging problem. Current methods are primarily based on hand-designed feature extraction rules or domain-specific data augmentation. Motivated by the advances in deep speech processing models and the fact that voice data are univariate temporal signals, in this paper we propose Voice2Series (V2S), a novel end-to-end approach that reprograms acoustic models for time series classification through input transformation learning and output label mapping. Leveraging the representation learning power of a large-scale pre-trained speech processing model, we show on 31 different time series tasks that V2S outperforms or is on par with state-of-the-art methods on 22 tasks, and improves their average accuracy by 1.72%. We further provide theoretical justification of V2S by proving that its population risk is upper bounded by the source risk and a Wasserstein distance accounting for feature alignment via reprogramming. Our results offer a new and effective approach to time series classification.
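The two ingredients named in the abstract, input transformation and output label mapping, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the centered zero-padding, the mask that confines the trainable perturbation `delta` to the padded region, the averaging used for many-to-one label mapping, and the toy linear `frozen_source_model` are all assumptions chosen for illustration.

```python
import numpy as np

def reprogram_input(x_target, delta, source_len):
    """Embed a short target series into a source-length input.

    Assumption: the series is zero-padded (centered) and the trainable
    perturbation `delta` is added only where the padding is.
    """
    t = x_target.shape[0]
    start = (source_len - t) // 2
    padded = np.zeros(source_len, dtype=float)
    padded[start:start + t] = x_target
    mask = np.ones(source_len, dtype=float)
    mask[start:start + t] = 0.0  # leave the original samples untouched
    return padded + mask * delta

def map_labels(source_probs, mapping):
    """Many-to-one label mapping (assumed form).

    Each target class score is the mean probability of the source
    classes assigned to it in `mapping`.
    """
    return np.array([source_probs[idx].mean() for idx in mapping])

def frozen_source_model(x, weights):
    """Hypothetical stand-in for a pre-trained acoustic model:
    a fixed linear layer followed by a softmax."""
    logits = weights @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()

def v2s_predict(x_target, delta, weights, mapping):
    """Reprogram the input, run the frozen model, map source labels to target labels."""
    x_src = reprogram_input(x_target, delta, weights.shape[1])
    return map_labels(frozen_source_model(x_src, weights), mapping)
```

In this sketch only `delta` would be trained on the target task; the source model weights stay fixed, which is what distinguishes reprogramming from fine-tuning.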

Author Information

Huck Yang (Georgia Tech)

C.-H. Huck Yang is a fourth-year Ph.D. student at the Georgia Institute of Technology working on robust and privacy-preserving speech recognition and sequence modeling (ICML 2021, ICASSP 2020 and 2021, InterSpeech 2020 and 2021, and more). Previously, he worked at Amazon Alexa Speech in 2020 and 2021, Hitachi Central Lab in 2019, EPFL in 2018, and KAUST and TSMC in 2017, and received the Wallace H. Coulter Fellowship in 2017. His advisor is Prof. Chin-Hui Lee, an IEEE Fellow and ISCA Fellow. He is on the job market for an academic or industry position in 2022.

Yun-Yun Tsai (Columbia University)
Pin-Yu Chen (IBM Research AI)
