Spotlight
AutoAttend: Automated Attention Representation Search
Chaoyu Guan · Xin Wang · Wenwu Zhu

Tue Jul 20 06:35 PM -- 06:40 PM (PDT)

Self-attention mechanisms have been widely adopted in many machine learning areas, including Natural Language Processing (NLP) and Graph Representation Learning (GRL). However, existing works rely heavily on hand-crafted designs to obtain customized attention mechanisms. In this paper, we automate Key, Query and Value representation design, which is one of the most important steps in obtaining effective self-attention. We propose an automated self-attention representation model, AutoAttend, which leverages Neural Architecture Search (NAS) to automatically search for powerful attention representations for downstream tasks. In particular, we design a tailored search space for attention representation automation that is flexible enough to produce effective attention representation designs. Based on the design prior obtained from attention representations in previous works, we further regularize our search space to reduce its complexity without loss of expressivity. Moreover, we propose a novel context-aware parameter sharing mechanism that considers the special characteristics of each sub-architecture to provide more accurate architecture estimations when conducting parameter sharing in our tailored search space. Experiments show the superiority of our proposed AutoAttend model over previous state-of-the-art methods on eight text classification tasks in NLP and four node classification tasks in GRL.
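
For readers unfamiliar with the Key, Query and Value roles mentioned in the abstract, the following is a minimal sketch of standard scaled dot-product self-attention in plain Python. It is not the AutoAttend implementation. The `layer_outputs` dictionary and the `choice` mapping are hypothetical, and the idea that a candidate design picks which layer's representation feeds each role is an assumption made here purely for illustration; the abstract does not spell out the search space.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    queries and keys must share a vector length; keys and values
    must contain the same number of vectors.
    """
    d = len(keys[0])
    dim_out = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_out)]
        )
    return attended


# Hypothetical per-layer representations of a 3-token input (illustration only).
layer_outputs = {
    "layer0": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "layer1": [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]],
}

# One hypothetical candidate: which representation plays which role.
choice = {"query": "layer1", "key": "layer0", "value": "layer0"}

attended = self_attention(
    layer_outputs[choice["query"]],
    layer_outputs[choice["key"]],
    layer_outputs[choice["value"]],
)
```

Under this reading, a hand-crafted design fixes the `choice` mapping in advance, whereas an automated search would evaluate many such mappings and keep the ones that perform best on the downstream task.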

Author Information

Chaoyu Guan (Tsinghua University)
Xin Wang (Tsinghua University)
Wenwu Zhu (Tsinghua University)

Wenwu Zhu is currently a Professor in the Department of Computer Science at Tsinghua University and Vice Dean of the National Research Center on Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs New Jersey as a Member of Technical Staff from 1996 to 1999. He has been serving as the chair of the steering committee for IEEE T-MM since January 1, 2020. He served as the Editor-in-Chief of the IEEE Transactions on Multimedia (T-MM) from 2017 to 2019 and as Vice EiC of the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) from 2020 to 2021. He served as co-Chair for ACM MM 2018 and co-Chair for ACM CIKM 2019. His current research interests are in the areas of multimodal big data and intelligence, and multimedia networking. He has received 10 Best Paper Awards. He is a member of Academia Europaea, an IEEE Fellow, an AAAS Fellow, and an SPIE Fellow.
