Oral
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
Aäron van den Oord · Yazhe Li · Igor Babuschkin · Karen Simonyan · Oriol Vinyals · Koray Kavukcuoglu · George van den Driessche · Edward Lockhart · Luis C Cobo · Florian Stimberg · Norman Casagrande · Dominik Grewe · Seb Noury · Sander Dieleman · Erich Elsen · Nal Kalchbrenner · Heiga Zen · Alex Graves · Helen King · Tom Walters · Dan Belov · Demis Hassabis

Fri Jul 13th 04:00 -- 04:20 PM @ A7

The recently developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system. However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today's massively parallel computers, and therefore hard to deploy in a real-time production setting. This paper introduces Probability Density Distillation, a new method for training a parallel feed-forward network from a trained WaveNet with no significant difference in quality. The resulting system generates high-fidelity speech samples more than 20 times faster than real time, a 1000x speedup relative to the original WaveNet, and is capable of serving multiple English and Japanese voices in a production setting.
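
The two speed figures in the abstract can be combined to see what they imply about the original model. The sketch below is only back-of-the-envelope arithmetic on the quoted numbers; the function and constant names are illustrative, and because the abstract says "more than 20 times", the results are rough bounds rather than measurements.

```python
# Figures quoted in the abstract (treated here as approximate bounds).
PARALLEL_REALTIME_FACTOR = 20.0  # "more than 20 times faster than real-time"
SPEEDUP_OVER_ORIGINAL = 1000.0   # "a 1000x speed up relative to the original WaveNet"


def implied_original_realtime_factor(parallel_factor: float, speedup: float) -> float:
    """Real-time factor the original model must have for both figures to hold."""
    return parallel_factor / speedup


def compute_seconds_per_audio_second(realtime_factor: float) -> float:
    """Seconds of computation needed to produce one second of audio."""
    return 1.0 / realtime_factor


original_factor = implied_original_realtime_factor(
    PARALLEL_REALTIME_FACTOR, SPEEDUP_OVER_ORIGINAL
)
original_cost = compute_seconds_per_audio_second(original_factor)
parallel_cost = compute_seconds_per_audio_second(PARALLEL_REALTIME_FACTOR)

print(f"original WaveNet: about {original_factor:.2f}x real time "
      f"({original_cost:.0f} s of compute per second of audio)")
print(f"parallel WaveNet: {PARALLEL_REALTIME_FACTOR:.0f}x real time or better "
      f"({parallel_cost:.2f} s of compute per second of audio, or less)")
```

Read this way, the quoted figures put the original sequential model at roughly one fiftieth of real-time speed, which is consistent with the abstract's point that it is hard to deploy in a real-time setting.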

Author Information

Aäron van den Oord (Google DeepMind)
Yazhe Li (DeepMind)
Igor Babuschkin (DeepMind)
Karen Simonyan (DeepMind)
Oriol Vinyals (DeepMind)

Oriol Vinyals is a Research Scientist at Google. He works in deep learning with the Google Brain team. Oriol holds a Ph.D. in EECS from the University of California, Berkeley, and a master's degree from the University of California, San Diego. He is a recipient of the 2011 Microsoft Research PhD Fellowship. He was an early adopter of the new deep learning wave at Berkeley, and in his thesis he focused on non-convex optimization and recurrent neural networks. At Google Brain he continues working on his areas of interest, which include artificial intelligence, with particular emphasis on machine learning, language, and vision.

Koray Kavukcuoglu (DeepMind)
George van den Driessche (DeepMind)
Edward Lockhart
Luis C Cobo (DeepMind)
Florian Stimberg
Norman Casagrande (DeepMind)
Dominik Grewe
Seb Noury (DeepMind)
Sander Dieleman (DeepMind)
Erich Elsen
Nal Kalchbrenner (Google Brain Amsterdam)
Heiga Zen
Alex Graves (DeepMind)
Helen King (DeepMind)
Tom Walters (DeepMind)
Dan Belov (Google)
Demis Hassabis (DeepMind)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors