Timezone: »
Prosody plays an important role in characterizing the style of a speaker or an emotion, but most non-parallel voice or emotion style transfer algorithms do not convert any prosody information. Two major components of prosody are pitch and rhythm. Disentangling the prosody information, particularly the rhythm component, from the speech is challenging because it involves breaking the synchrony between the input speech and the disentangled speech representation. As a result, most existing prosody style transfer algorithms would need to rely on some form of text transcriptions to identify the content information, which confines their application to high-resource languages only. Recently, SpeechSplit has made sizeable progress towards unsupervised prosody style transfer, but it is unable to extract high-level global prosody style in an unsupervised manner. In this paper, we propose AutoPST, which can disentangle global prosody style from speech without relying on any text transcriptions. AutoPST is an Autoencoder-based Prosody Style Transfer framework with a thorough rhythm removal module guided by the self-expressive representation learning. Experiments on different style transfer tasks show that AutoPST can effectively convert prosody that correctly reflects the styles of the target domains.
Author Information
Kaizhi Qian (MIT-IBM Watson AI Lab)
Yang Zhang (MIT-IBM Watson AI Lab)
Shiyu Chang (UCSB)
Jinjun Xiong (IBM Thomas J. Watson Research Center)
Chuang Gan (MIT-IBM Watson AI Lab)
David Cox (MIT-IBM Watson AI Lab)
Mark Hasegawa-Johnson (University of Illinois)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Oral: Global Prosody Style Transfer Without Text Transcriptions »
Fri. Jul 23rd 12:00 -- 12:20 AM Room
More from the Same Authors
-
2023 Poster: Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modularized Learning »
Zhongzhi Yu · Yang Zhang · Kaizhi Qian · Cheng Wan · Yonggan Fu · Yongan Zhang · Yingyan (Celine) Lin -
2023 Poster: Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models »
Guanhua Zhang · Jiabao Ji · Yang Zhang · Mo Yu · Tommi Jaakkola · Shiyu Chang -
2023 Poster: PromptBoosting: Black-Box Text Classification with Ten Forward Passes »
Bairu Hou · Joe O'Connor · Jacob Andreas · Shiyu Chang · Yang Zhang -
2023 Poster: On the Forward Invariance of Neural ODEs »
Wei Xiao · Tsun-Hsuan Wang · Ramin Hasani · Mathias Lechner · Yutong Ban · Chuang Gan · Daniela Rus -
2023 Poster: Reparameterized Policy Learning for Multimodal Trajectory Optimization »
Zhiao Huang · Litian Liang · Zhan Ling · Xuanlin Li · Chuang Gan · Hao Su -
2023 Poster: Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics »
Pingchuan Ma · Peter Yichen Chen · Bolei Deng · Josh Tenenbaum · Tao Du · Chuang Gan · Wojciech Matusik -
2023 Oral: Reparameterized Policy Learning for Multimodal Trajectory Optimization »
Zhiao Huang · Litian Liang · Zhan Ling · Xuanlin Li · Chuang Gan · Hao Su -
2022 Poster: Learning Stable Classifiers by Transferring Unstable Features »
Yujia Bao · Shiyu Chang · Regina Barzilay -
2022 Poster: Data-Efficient Double-Win Lottery Tickets from Robust Pre-training »
Tianlong Chen · Zhenyu Zhang · Sijia Liu · Yang Zhang · Shiyu Chang · Zhangyang “Atlas” Wang -
2022 Poster: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Poster: ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers »
Kaizhi Qian · Yang Zhang · Heting Gao · Junrui Ni · Cheng-I Lai · David Cox · Mark Hasegawa-Johnson · Shiyu Chang -
2022 Spotlight: Data-Efficient Double-Win Lottery Tickets from Robust Pre-training »
Tianlong Chen · Zhenyu Zhang · Sijia Liu · Yang Zhang · Shiyu Chang · Zhangyang “Atlas” Wang -
2022 Spotlight: Learning Stable Classifiers by Transferring Unstable Features »
Yujia Bao · Shiyu Chang · Regina Barzilay -
2022 Spotlight: ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers »
Kaizhi Qian · Yang Zhang · Heting Gao · Junrui Ni · Cheng-I Lai · David Cox · Mark Hasegawa-Johnson · Shiyu Chang -
2022 Spotlight: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Poster: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization »
Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu -
2022 Spotlight: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization »
Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu -
2022 Poster: Forget-free Continual Learning with Winning Subnetworks »
Haeyong Kang · Rusty Mina · Sultan Rizky Hikmawan Madjid · Jaehong Yoon · Mark Hasegawa-Johnson · Sung Ju Hwang · Chang Yoo -
2022 Poster: Prompting Decision Transformer for Few-Shot Policy Generalization »
Mengdi Xu · Yikang Shen · Shun Zhang · Yuchen Lu · Ding Zhao · Josh Tenenbaum · Chuang Gan -
2022 Spotlight: Forget-free Continual Learning with Winning Subnetworks »
Haeyong Kang · Rusty Mina · Sultan Rizky Hikmawan Madjid · Jaehong Yoon · Mark Hasegawa-Johnson · Sung Ju Hwang · Chang Yoo -
2022 Spotlight: Prompting Decision Transformer for Few-Shot Policy Generalization »
Mengdi Xu · Yikang Shen · Shun Zhang · Yuchen Lu · Ding Zhao · Josh Tenenbaum · Chuang Gan -
2021 Poster: Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers »
Yujia Bao · Shiyu Chang · Regina Barzilay -
2021 Spotlight: Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers »
Yujia Bao · Shiyu Chang · Regina Barzilay -
2021 Poster: Adversarial Option-Aware Hierarchical Imitation Learning »
Mingxuan Jing · Wenbing Huang · Fuchun Sun · Xiaojian Ma · Tao Kong · Chuang Gan · Lei Li -
2021 Poster: AGENT: A Benchmark for Core Psychological Reasoning »
Tianmin Shu · Abhishek Bhandwaldar · Chuang Gan · Kevin Smith · Shari Liu · Dan Gutfreund · Elizabeth Spelke · Josh Tenenbaum · Tomer Ullman -
2021 Spotlight: AGENT: A Benchmark for Core Psychological Reasoning »
Tianmin Shu · Abhishek Bhandwaldar · Chuang Gan · Kevin Smith · Shari Liu · Dan Gutfreund · Elizabeth Spelke · Josh Tenenbaum · Tomer Ullman -
2021 Spotlight: Adversarial Option-Aware Hierarchical Imitation Learning »
Mingxuan Jing · Wenbing Huang · Fuchun Sun · Xiaojian Ma · Tao Kong · Chuang Gan · Lei Li -
2021 Poster: Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators »
Yonggan Fu · Yongan Zhang · Yang Zhang · David Cox · Yingyan Lin -
2021 Spotlight: Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators »
Yonggan Fu · Yongan Zhang · Yang Zhang · David Cox · Yingyan Lin -
2020 Poster: Invariant Rationalization »
Shiyu Chang · Yang Zhang · Mo Yu · Tommi Jaakkola -
2020 Poster: Proper Network Interpretability Helps Adversarial Robustness in Classification »
Akhilan Boopathy · Sijia Liu · Gaoyuan Zhang · Cynthia Liu · Pin-Yu Chen · Shiyu Chang · Luca Daniel -
2020 Poster: Unsupervised Speech Decomposition via Triple Information Bottleneck »
Kaizhi Qian · Yang Zhang · Shiyu Chang · Mark Hasegawa-Johnson · David Cox -
2020 Poster: Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case »
shuai zhang · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2019 Poster: AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss »
Kaizhi Qian · Yang Zhang · Shiyu Chang · Xuesong Yang · Mark Hasegawa-Johnson -
2019 Oral: AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss »
Kaizhi Qian · Yang Zhang · Shiyu Chang · Xuesong Yang · Mark Hasegawa-Johnson