Poster
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren · Xu Tan · Tao Qin · Sheng Zhao · Zhou Zhao · Tie-Yan Liu
Text to speech (TTS) and automatic speech recognition (ASR) are two dual tasks in speech processing, and both achieve impressive performance thanks to recent advances in deep learning and large amounts of aligned speech and text data. However, the lack of aligned data poses a major practical problem for TTS and ASR on low-resource languages. In this paper, by leveraging the dual nature of the two tasks, we propose an almost unsupervised learning method that leverages only a few hundred paired samples and extra unpaired data for TTS and ASR. Our method consists of the following components: (1) a denoising auto-encoder, which reconstructs speech and text sequences respectively to develop the capability of language modeling in both the speech and text domains; (2) dual transformation, where the TTS model transforms the text $y$ into speech $\hat{x}$, and the ASR model leverages the transformed pair $(\hat{x},y)$ for training, and vice versa, to boost the accuracy of the two tasks; (3) bidirectional sequence modeling, which addresses the error propagation problem, especially in long speech and text sequences, when training with few paired data; (4) a unified model structure, which combines all the above components for TTS and ASR based on the Transformer model. Our method achieves a 99.84\% word-level intelligible rate and 2.68 MOS for TTS, and 11.7\% PER for ASR on the LJSpeech dataset, by leveraging only 200 paired speech and text samples (about 20 minutes of audio), together with extra unpaired speech and text data.
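The data flow of components (1) and (2) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the mask symbol, the function names, and the toy TTS and ASR stand-ins are all hypothetical, and the Transformer models and training steps are omitted. It only shows how a corrupted input for a denoising auto-encoder might be produced, and how unpaired text and speech could be turned into pseudo-pairs for the opposite task.

```python
import random
from typing import Callable, List, Sequence, Tuple

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def corrupt(seq: Sequence[str], mask_prob: float, rng: random.Random) -> List[str]:
    """Replace each element with MASK with probability mask_prob.

    A denoising auto-encoder would be trained to reconstruct `seq`
    from the corrupted copy returned here.
    """
    return [MASK if rng.random() < mask_prob else tok for tok in seq]


def dual_transformation_pairs(
    unpaired_text: List[str],
    unpaired_speech: List[List[float]],
    tts: Callable[[str], List[float]],
    asr: Callable[[List[float]], str],
) -> Tuple[List[Tuple[List[float], str]], List[Tuple[str, List[float]]]]:
    """Build pseudo-pairs from unpaired data.

    TTS turns text y into speech x_hat, giving (x_hat, y) to train ASR.
    ASR turns speech x into text y_hat, giving (y_hat, x) to train TTS.
    """
    asr_train = [(tts(y), y) for y in unpaired_text]
    tts_train = [(asr(x), x) for x in unpaired_speech]
    return asr_train, tts_train


def toy_tts(text: str) -> List[float]:
    """Toy stand-in for a TTS model: one 'frame' per character."""
    return [float(ord(c)) for c in text]


def toy_asr(frames: List[float]) -> str:
    """Toy stand-in for an ASR model: inverse of toy_tts."""
    return "".join(chr(int(v)) for v in frames)


asr_train, tts_train = dual_transformation_pairs(
    ["hi"], [[111.0, 107.0]], toy_tts, toy_asr
)
```

In a real system the two models would be updated on these pseudo-pairs and the pairs regenerated as the models improve; the sketch builds them only once.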
Author Information
Yi Ren (Zhejiang University)
Xu Tan (Microsoft Research)
Tao Qin (Microsoft Research Asia)
Sheng Zhao (Microsoft)
Zhou Zhao (Zhejiang University)
Tie-Yan Liu (Microsoft)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Almost Unsupervised Text to Speech and Automatic Speech Recognition
  Thu. Jun 13th 05:05 -- 05:10 PM, Room 201
More from the Same Authors
- 2023 Poster: NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition
  Xinquan Huang · Wenlei Shi · Qi Meng · Yue Wang · Xiaotian Gao · Jia Zhang · Tie-Yan Liu
- 2023 Poster: Retrosynthetic Planning with Dual Value Networks
  Guoqing Liu · Di Xue · Shufang Xie · Yingce Xia · Austin Tripp · Krzysztof Maziarz · Marwin Segler · Tao Qin · Zongzhang Zhang · Tie-Yan Liu
- 2023 Poster: Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
  Rongjie Huang · Jiawei Huang · Dongchao Yang · Yi Ren · Luping Liu · Mingze Li · Zhenhui Ye · Jinglin Liu · Xiang Yin · Zhou Zhao
- 2022 Poster: Analyzing and Mitigating Interference in Neural Architecture Search
  Jin Xu · Xu Tan · Kaitao Song · Renqian Luo · Yichong Leng · Tao Qin · Tie-Yan Liu · Jian Li
- 2022 Poster: Supervised Off-Policy Ranking
  Yue Jin · Yue Zhang · Tao Qin · Xudong Zhang · Jian Yuan · Houqiang Li · Tie-Yan Liu
- 2022 Spotlight: Supervised Off-Policy Ranking
  Yue Jin · Yue Zhang · Tao Qin · Xudong Zhang · Jian Yuan · Houqiang Li · Tie-Yan Liu
- 2022 Spotlight: Analyzing and Mitigating Interference in Neural Architecture Search
  Jin Xu · Xu Tan · Kaitao Song · Renqian Luo · Yichong Leng · Tao Qin · Tie-Yan Liu · Jian Li
- 2021 Poster: Learning to Rehearse in Long Sequence Memorization
  Zhu Zhang · Chang Zhou · Jianxin Ma · Zhijie Lin · Jingren Zhou · Hongxia Yang · Zhou Zhao
- 2021 Spotlight: Learning to Rehearse in Long Sequence Memorization
  Zhu Zhang · Chang Zhou · Jianxin Ma · Zhijie Lin · Jingren Zhou · Hongxia Yang · Zhou Zhao
- 2021 Poster: How could Neural Networks understand Programs?
  Dinglan Peng · Shuxin Zheng · Yatao Li · Guolin Ke · Di He · Tie-Yan Liu
- 2021 Spotlight: How could Neural Networks understand Programs?
  Dinglan Peng · Shuxin Zheng · Yatao Li · Guolin Ke · Di He · Tie-Yan Liu
- 2021 Poster: Temporally Correlated Task Scheduling for Sequence Learning
  Xueqing Wu · Lewen Wang · Yingce Xia · Weiqing Liu · Lijun Wu · Shufang Xie · Tao Qin · Tie-Yan Liu
- 2021 Spotlight: Temporally Correlated Task Scheduling for Sequence Learning
  Xueqing Wu · Lewen Wang · Yingce Xia · Weiqing Liu · Lijun Wu · Shufang Xie · Tao Qin · Tie-Yan Liu
- 2020 Poster: Sequence Generation with Mixed Representations
  Lijun Wu · Shufang Xie · Yingce Xia · Yang Fan · Jian-Huang Lai · Tao Qin · Tie-Yan Liu
- 2019 Poster: MASS: Masked Sequence to Sequence Pre-training for Language Generation
  Kaitao Song · Xu Tan · Tao Qin · Jianfeng Lu · Tie-Yan Liu
- 2019 Poster: Adaptive Regret of Convex and Smooth Functions
  Lijun Zhang · Tie-Yan Liu · Zhi-Hua Zhou
- 2019 Poster: Efficient Training of BERT by Progressively Stacking
  Linyuan Gong · Di He · Zhuohan Li · Tao Qin · Liwei Wang · Tie-Yan Liu
- 2019 Oral: Efficient Training of BERT by Progressively Stacking
  Linyuan Gong · Di He · Zhuohan Li · Tao Qin · Liwei Wang · Tie-Yan Liu
- 2019 Oral: MASS: Masked Sequence to Sequence Pre-training for Language Generation
  Kaitao Song · Xu Tan · Tao Qin · Jianfeng Lu · Tie-Yan Liu
- 2019 Oral: Adaptive Regret of Convex and Smooth Functions
  Lijun Zhang · Tie-Yan Liu · Zhi-Hua Zhou
- 2018 Poster: Towards Binary-Valued Gates for Robust LSTM Training
  Zhuohan Li · Di He · Fei Tian · Wei Chen · Tao Qin · Liwei Wang · Tie-Yan Liu
- 2018 Oral: Towards Binary-Valued Gates for Robust LSTM Training
  Zhuohan Li · Di He · Fei Tian · Wei Chen · Tao Qin · Liwei Wang · Tie-Yan Liu
- 2018 Poster: Model-Level Dual Learning
  Yingce Xia · Xu Tan · Fei Tian · Tao Qin · Nenghai Yu · Tie-Yan Liu
- 2018 Oral: Model-Level Dual Learning
  Yingce Xia · Xu Tan · Fei Tian · Tao Qin · Nenghai Yu · Tie-Yan Liu
- 2017 Poster: Asynchronous Stochastic Gradient Descent with Delay Compensation
  Shuxin Zheng · Qi Meng · Taifeng Wang · Wei Chen · Nenghai Yu · Zhiming Ma · Tie-Yan Liu
- 2017 Talk: Asynchronous Stochastic Gradient Descent with Delay Compensation
  Shuxin Zheng · Qi Meng · Taifeng Wang · Wei Chen · Nenghai Yu · Zhiming Ma · Tie-Yan Liu
- 2017 Poster: Dual Supervised Learning
  Yingce Xia · Tao Qin · Wei Chen · Jiang Bian · Nenghai Yu · Tie-Yan Liu
- 2017 Talk: Dual Supervised Learning
  Yingce Xia · Tao Qin · Wei Chen · Jiang Bian · Nenghai Yu · Tie-Yan Liu