The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample k elements without replacement. We show how to implicitly apply this 'Gumbel-Top-k' trick on a factorized distribution over sequences, allowing us to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linearly in k and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and beam search and can be used as a principled intermediate alternative. In a translation task, we show that the proposed method compares favourably against alternatives to obtain diverse yet good-quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
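As a rough illustration of the Gumbel-Top-k trick mentioned in the abstract, the sketch below applies it to a small, explicitly enumerated categorical distribution: each log-probability is perturbed with independent Gumbel(0, 1) noise and the k largest perturbed values are kept. This is not the authors' code and not Stochastic Beam Search itself, which applies the trick implicitly over sequences. The function names `sample_gumbel` and `gumbel_top_k` and the example probabilities are assumptions made here for illustration.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_top_k(log_probs, k, rng):
    """Return k distinct indices, ordered by perturbed log-probability.

    Illustrative sketch: perturb every log-probability with independent
    Gumbel noise, then keep the indices of the k largest perturbed values.
    With k = 1 this reduces to the ordinary Gumbel-Max trick.
    """
    if not 0 < k <= len(log_probs):
        raise ValueError("k must be between 1 and the number of categories")
    perturbed = [(lp + sample_gumbel(rng), i) for i, lp in enumerate(log_probs)]
    perturbed.sort(reverse=True)
    return [i for _, i in perturbed[:k]]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.7, 0.2, 0.1]  # hypothetical example distribution
    log_probs = [math.log(p) for p in probs]
    print(gumbel_top_k(log_probs, 2, rng))  # two distinct category indices
```

Note that this direct version touches every category, which is only feasible for small domains; the abstract's point is that the same sampling can be carried out over exponentially large sequence domains with a number of model evaluations linear in k and the maximum sequence length.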
Author Information
Wouter Kool (University of Amsterdam)
Herke van Hoof (University of Amsterdam)
Max Welling (University of Amsterdam)
Prof. Dr. Max Welling is a research chair in Machine Learning at the University of Amsterdam and a VP Technologies at Qualcomm. He has a secondary appointment as a senior fellow at the Canadian Institute for Advanced Research (CIFAR). He is co-founder of “Scyfer BV”, a university spin-off in deep learning that was acquired by Qualcomm in summer 2017. In the past he held postdoctoral positions at Caltech (’98-’00), UCL (’00-’01) and the University of Toronto (’01-’03). He received his PhD in ’98 under the supervision of Nobel laureate Prof. G. 't Hooft. Max Welling served as associate editor in chief of IEEE TPAMI from 2011 to 2015 (impact factor 4.8). He has served on the board of the NIPS foundation since 2015 (NIPS being the largest conference in machine learning) and was program chair and general chair of NIPS in 2013 and 2014 respectively. He was also program chair of AISTATS in 2009 and ECCV in 2016 and general chair of MIDL 2018. He has served on the editorial boards of JMLR and JML and was an associate editor for Neurocomputing, JCGS and TPAMI. He has received multiple grants from Google, Facebook, Yahoo, NSF, NIH, NWO and ONR-MURI, including an NSF career grant in 2005. He received the ECCV Koenderink Prize in 2010. Welling is on the board of the Data Science Research Center in Amsterdam, directs the Amsterdam Machine Learning Lab (AMLAB), and co-directs the Qualcomm-UvA deep learning lab (QUVA) and the Bosch-UvA Deep Learning lab (DELTA). Max Welling has over 200 scientific publications in machine learning, computer vision, statistics and physics.
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement
  Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #41
More from the Same Authors
- 2023 Workshop: Structured Probabilistic Inference and Generative Modeling
  Dinghuai Zhang · Yuanqi Du · Chenlin Meng · Shawn Tan · Yingzhen Li · Max Welling · Yoshua Bengio
- 2022 Poster: Lie Point Symmetry Data Augmentation for Neural PDE Solvers
  Johannes Brandstetter · Max Welling · Daniel Worrall
- 2022 Spotlight: Lie Point Symmetry Data Augmentation for Neural PDE Solvers
  Johannes Brandstetter · Max Welling · Daniel Worrall
- 2022 Poster: Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
  Qi Wang · Herke van Hoof
- 2022 Spotlight: Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
  Qi Wang · Herke van Hoof
- 2021 Test Of Time: Test of Time Award
  Max Welling · Max Welling
- 2021 Poster: Deep Coherent Exploration for Continuous Control
  Yijie Zhang · Herke van Hoof
- 2021 Spotlight: Deep Coherent Exploration for Continuous Control
  Yijie Zhang · Herke van Hoof
- 2021 Poster: A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
  Marc Finzi · Max Welling · Andrew Wilson
- 2021 Oral: A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
  Marc Finzi · Max Welling · Andrew Wilson
- 2020 Poster: Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
  Qi Wang · Herke van Hoof
- 2019 Workshop: Learning and Reasoning with Graph-Structured Representations
  Ethan Fetaya · Zhiting Hu · Thomas Kipf · Yujia Li · Xiaodan Liang · Renjie Liao · Raquel Urtasun · Hao Wang · Max Welling · Eric Xing · Richard Zemel
- 2019: Networking Lunch (provided) + Poster Session
  Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki
- 2018 Invited Talk: Intelligence per Kilowatthour
  Max Welling
- 2018 Poster: BOCK : Bayesian Optimization with Cylindrical Kernels
  ChangYong Oh · Efstratios Gavves · Max Welling
- 2018 Oral: BOCK : Bayesian Optimization with Cylindrical Kernels
  ChangYong Oh · Efstratios Gavves · Max Welling
- 2017 Poster: Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
  Christos Louizos · Max Welling
- 2017 Talk: Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
  Christos Louizos · Max Welling