Mixture models trained via EM are among the simplest, most widely used, and best understood latent variable models in the machine learning literature. Surprisingly, these models have hardly been explored in text generation applications such as machine translation. In principle, they provide a latent variable to control generation and produce a diverse set of hypotheses. In practice, however, mixture models are prone to degeneracies: often only one component gets trained, or the latent variable is simply ignored. We find that disabling dropout noise in responsibility computation is critical to successful training. In addition, the design choices of parameterization, prior distribution, hard versus soft EM, and online versus offline assignment can dramatically affect model performance. We develop an evaluation protocol to assess both quality and diversity of generations against multiple references, and provide an extensive empirical study of several mixture model variants. Our analysis shows that certain types of mixture models are more robust and offer the best trade-off between translation quality and diversity compared to variational models and diverse decoding approaches.\footnote{Our code will be made publicly available after the review process.}
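As a rough illustration of the training recipe the abstract alludes to, the sketch below shows hard EM with online assignment for a K-component mixture translation model, where dropout is disabled when computing responsibilities (E-step) and re-enabled for the gradient update (M-step). The toy MixtureSeq2Seq module, its hyperparameters, and the helper names are hypothetical stand-ins for illustration only, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureSeq2Seq(nn.Module):
    """Toy stand-in for a seq2seq model conditioned on a latent component z."""
    def __init__(self, vocab_size=100, hidden=32, num_components=3, dropout=0.3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.z_embed = nn.Embedding(num_components, hidden)  # embedding of the latent component
        self.dropout = nn.Dropout(dropout)
        self.proj = nn.Linear(hidden, vocab_size)
        self.num_components = num_components

    def token_logits(self, src, tgt_in, z):
        # crude "encoder": mean-pooled source embeddings added to target embeddings
        h = self.embed(tgt_in) + self.embed(src).mean(dim=1, keepdim=True)
        h = h + self.z_embed(z).unsqueeze(1)   # condition generation on component z
        h = self.dropout(h)
        return self.proj(h)

    def neg_log_likelihood(self, src, tgt_in, tgt_out, z):
        logits = self.token_logits(src, tgt_in, z)
        nll = F.cross_entropy(
            logits.view(-1, logits.size(-1)), tgt_out.view(-1), reduction="none"
        )
        return nll.view(tgt_out.size()).sum(dim=1)  # per-sentence negative log-likelihood

def hard_em_step(model, optimizer, src, tgt_in, tgt_out):
    B = src.size(0)
    # E-step: pick the best component per sentence with dropout DISABLED
    # (model.eval()) and no gradients -- the trick highlighted in the abstract.
    model.eval()
    with torch.no_grad():
        nll = torch.stack([
            model.neg_log_likelihood(src, tgt_in, tgt_out,
                                     torch.full((B,), k, dtype=torch.long))
            for k in range(model.num_components)
        ], dim=1)                               # shape (B, K)
        z_star = nll.argmin(dim=1)              # hard, online assignment
    # M-step: gradient update on the chosen component, with dropout re-enabled.
    model.train()
    loss = model.neg_log_likelihood(src, tgt_in, tgt_out, z_star).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), z_star

At decoding time, diverse hypotheses would be obtained by decoding once per component value z, rather than relying on beam diversity alone.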
Author Information
Tianxiao Shen (MIT)
Myle Ott (Facebook AI Research)
Michael Auli (Facebook)
Marc'Aurelio Ranzato (Facebook)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Mixture Models for Diverse Machine Translation: Tricks of the Trade
  Fri. Jun 14th 01:30 -- 04:00 AM, Room Pacific Ballroom #106
More from the Same Authors
- 2023 Poster: Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
  Alexei Baevski · Arun Babu · Wei-Ning Hsu · Michael Auli
- 2023 Oral: Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
  Alexei Baevski · Arun Babu · Wei-Ning Hsu · Michael Auli
- 2022 Poster: data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
  Alexei Baevski · Wei-Ning Hsu · Qiantong Xu · Arun Babu · Jiatao Gu · Michael Auli
- 2022 Oral: data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
  Alexei Baevski · Wei-Ning Hsu · Qiantong Xu · Arun Babu · Jiatao Gu · Michael Auli
- 2020 Poster: Educating Text Autoencoders: Latent Representation Guidance via Denoising
  Tianxiao Shen · Jonas Mueller · Regina Barzilay · Tommi Jaakkola
- 2018 Poster: Analyzing Uncertainty in Neural Machine Translation
  Myle Ott · Michael Auli · David Grangier · Marc'Aurelio Ranzato
- 2018 Oral: Analyzing Uncertainty in Neural Machine Translation
  Myle Ott · Michael Auli · David Grangier · Marc'Aurelio Ranzato
- 2017 Poster: Convolutional Sequence to Sequence Learning
  Jonas Gehring · Michael Auli · David Grangier · Denis Yarats · Yann Dauphin
- 2017 Poster: Language Modeling with Gated Convolutional Networks
  Yann Dauphin · Angela Fan · Michael Auli · David Grangier
- 2017 Talk: Convolutional Sequence to Sequence Learning
  Jonas Gehring · Michael Auli · David Grangier · Denis Yarats · Yann Dauphin
- 2017 Talk: Language Modeling with Gated Convolutional Networks
  Yann Dauphin · Angela Fan · Michael Auli · David Grangier