Poster
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao · Li Dong · Furu Wei · Wenhui Wang · Nan Yang · Xiaodong Liu · Yu Wang · Jianfeng Gao · Songhao Piao · Ming Zhou · Hsiao-Wuen Hon

Thu Jul 16 08:00 AM -- 08:45 AM & Thu Jul 16 07:00 PM -- 07:45 PM (PDT) @ Virtual

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of language understanding and generation tasks across several widely used benchmarks. The code and pre-trained models are available at https://github.com/microsoft/unilm.
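The abstract's core idea, conventional [MASK] tokens for autoencoding plus appended pseudo [P] tokens that reuse the masked positions' embeddings for partially autoregressive prediction, can be illustrated with a small sketch. This is an assumption-laden toy (function names like `build_pmlm_inputs` are hypothetical, not from the microsoft/unilm repository), and it uses a simplified per-token factorization order where the paper factorizes over spans:

```python
# Illustrative sketch of PMLM input and attention-mask construction.
# All names and the per-token factorization are simplifications for
# exposition; see https://github.com/microsoft/unilm for the real code.

def build_pmlm_inputs(tokens, masked_positions):
    """Append one [MASK] and one [P] (pseudo mask) per masked position.

    Both special tokens reuse the position id of the token they stand
    for, so context encodings can be shared by the autoencoding (AE)
    and partially autoregressive (PAR) objectives.
    """
    n = len(tokens)
    seq, pos_ids = list(tokens), list(range(n))
    for p in masked_positions:          # conventional masks (AE)
        seq.append("[MASK]")
        pos_ids.append(p)
    for p in masked_positions:          # pseudo masks (PAR)
        seq.append("[P]")
        pos_ids.append(p)
    return seq, pos_ids


def build_attention_mask(n, masked_positions):
    """allow[i][j] == True means query i may attend to key j.

    - AE: unmasked context and [MASK] tokens attend bidirectionally to
      each other, but never to the original tokens at masked positions.
    - PAR: the k-th [P] token sees the unmasked context, the original
      tokens of earlier-factorized positions, and itself -- never the
      original token it must predict.
    """
    m = len(masked_positions)
    total = n + 2 * m                   # context + [MASK]s + [P]s
    visible = [j for j in range(n) if j not in masked_positions]
    allow = [[False] * total for _ in range(total)]

    ae_keys = visible + [n + k for k in range(m)]
    for i in ae_keys:                   # bidirectional AE block
        for j in ae_keys:
            allow[i][j] = True

    for k, p in enumerate(masked_positions):
        pk = n + m + k                  # index of the k-th [P] token
        for q in (pk, p):               # [P]_k and the original token
            for j in visible:
                allow[q][j] = True
            for prev in masked_positions[:k]:
                allow[q][prev] = True   # earlier-factorized originals
        allow[pk][pk] = True            # [P] sees itself, not the answer
        allow[p][p] = True              # original token sees itself
    return allow


seq, pos_ids = build_pmlm_inputs(["x1", "x2", "x3", "x4", "x5"], [1, 3])
mask = build_attention_mask(5, [1, 3])
```

With positions 1 and 3 masked, the [P] token for position 3 can attend to the original token at position 1 (already factorized) but not to position 3 itself, while [MASK] tokens see only the unmasked context, which is the intra-/inter-relation split the abstract describes.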

Author Information

Hangbo Bao (Harbin Institute of Technology)
Li Dong (Microsoft Research)
Furu Wei (Microsoft Research Asia)
Wenhui Wang (Microsoft Research)
Nan Yang (Microsoft Research Asia)
Xiaodong Liu (Microsoft Research)
Yu Wang (Microsoft Research)
Jianfeng Gao (Microsoft Research AI)

Jianfeng Gao is a Partner Research Manager at Microsoft Research AI. He leads the development of AI systems for machine reading comprehension (MRC), question answering (QA), social bots, goal-oriented dialogue, and business applications. From 2014 to 2017, he was a Partner Research Manager at the Deep Learning Technology Center at Microsoft Research, Redmond, where he led research on deep learning for text and image processing. From 2006 to 2014, he was a Principal Researcher in the Natural Language Processing Group at Microsoft Research, Redmond, where he worked on Web search, query understanding and reformulation, ads prediction, and statistical machine translation. From 2005 to 2006, he was a Research Lead in the Natural Interactive Services Division at Microsoft, where he worked on Project X, an effort to develop a natural user interface for Windows. From 2000 to 2005, he was a Research Lead in the Natural Language Computing Group at Microsoft Research Asia, where he and his colleagues developed the first Chinese speech recognition system released with Microsoft Office, the Chinese/Japanese Input Method Editors (IME) that were the leading products in the market, and the natural language platform for Microsoft Windows.

Songhao Piao (Harbin Institute of Technology)
Ming Zhou (Microsoft Research)
Hsiao-Wuen Hon (Microsoft Research)