We introduce a new beam search decoder that is fully differentiable, making it possible to optimize at training time through the inference procedure. Our decoder allows us to combine models that operate at different granularities (e.g., acoustic and language models). It also handles an arbitrary number of target sequence candidates, making it suitable in contexts where labeled data is not aligned to input sequences. We demonstrate that our approach scales by applying it to speech recognition, jointly training acoustic and word-level language models. The system is end-to-end, with gradients flowing through the whole architecture from the word-level transcriptions. Recent research efforts have shown that deep neural networks with attention-based mechanisms are powerful enough to successfully train an acoustic model from the final transcription, while implicitly learning a language model. Instead, we show that it is possible to discriminatively train an acoustic model jointly with an explicit and possibly pre-trained language model.
Author Information
Ronan Collobert (Facebook AI Research)
Awni Hannun (Facebook AI Research)
Gabriel Synnaeve (Facebook AI Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: A fully differentiable beam search decoder
  Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #226
More from the Same Authors
- 2022 Poster: Flashlight: Enabling Innovation in Tools for Machine Learning
  Jacob Kahn · Vineel Pratap · Tatiana Likhomanenko · Qiantong Xu · Awni Hannun · Jeff Cai · Paden Tomasello · Ann Lee · Edouard Grave · Gilad Avidov · Benoit Steiner · Vitaliy Liptchinsky · Gabriel Synnaeve · Ronan Collobert
- 2022 Spotlight: Flashlight: Enabling Innovation in Tools for Machine Learning
  Jacob Kahn · Vineel Pratap · Tatiana Likhomanenko · Qiantong Xu · Awni Hannun · Jeff Cai · Paden Tomasello · Ann Lee · Edouard Grave · Gilad Avidov · Benoit Steiner · Vitaliy Liptchinsky · Gabriel Synnaeve · Ronan Collobert
- 2020 Poster: Certified Data Removal from Machine Learning Models
  Chuan Guo · Tom Goldstein · Awni Hannun · Laurens van der Maaten
- 2020 Poster: Growing Action Spaces
  Gregory Farquhar · Laura Gustafson · Zeming Lin · Shimon Whiteson · Nicolas Usunier · Gabriel Synnaeve
- 2020 Poster: Word-Level Speech Recognition With a Letter to Word Encoder
  Ronan Collobert · Awni Hannun · Gabriel Synnaeve
- 2017 Workshop: Video Games and Machine Learning
  Gabriel Synnaeve · Julian Togelius · Tom Schaul · Oriol Vinyals · Nicolas Usunier