Self-attention mechanisms have been widely adopted in many machine learning areas, including Natural Language Processing (NLP) and Graph Representation Learning (GRL). However, existing works rely heavily on hand-crafted designs to obtain customized attention mechanisms. In this paper, we automate Key, Query and Value representation design, which is one of the most important steps in obtaining effective self-attention. We propose an automated self-attention representation model, AutoAttend, which can automatically search for powerful attention representations for downstream tasks by leveraging Neural Architecture Search (NAS). In particular, we design a tailored search space for attention representation automation that is flexible enough to produce effective attention representation designs. Based on the design prior obtained from attention representations in previous works, we further regularize our search space to reduce the space complexity without loss of expressivity. Moreover, we propose a novel context-aware parameter sharing mechanism that considers the special characteristics of each sub-architecture, providing more accurate architecture estimations when conducting parameter sharing in our tailored search space. Experiments show the superiority of our proposed AutoAttend model over previous state-of-the-art methods on eight text classification tasks in NLP and four node classification tasks in GRL.
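For readers unfamiliar with the Key, Query and Value representations the abstract refers to, the following is a minimal sketch of standard scaled dot-product self-attention. It is not AutoAttend's searched design; it only illustrates the hand-crafted baseline whose Key, Query and Value projections the paper proposes to search automatically. The function names and the toy identity weight matrices are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self-attention (not the AutoAttend design).

    x: sequence of token vectors, shape (n, d)
    w_q, w_k, w_v: projection matrices producing Query, Key and Value
    """
    q = matmul(x, w_q)  # Query representation
    k = matmul(x, w_k)  # Key representation
    v = matmul(x, w_v)  # Value representation
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)  # pairwise Query-Key similarity
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: three 2-dimensional tokens, identity projections (assumed values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how strongly one token attends to every other token; `output` is the corresponding weighted mix of Value vectors. In this sketch the three projections are fixed linear maps, which is the kind of manual choice the abstract describes replacing with a search.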
Author Information
Chaoyu Guan (Tsinghua University)
Xin Wang (Tsinghua University)
Wenwu Zhu (Tsinghua University)
Wenwu Zhu is currently a Professor in the Department of Computer Science at Tsinghua University and Vice Dean of the National Research Center on Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs New Jersey as a Member of Technical Staff from 1996 to 1999. He has been serving as the chair of the steering committee for IEEE T-MM since January 1, 2020. He served as the Editor-in-Chief for the IEEE Transactions on Multimedia (T-MM) from 2017 to 2019 and as Vice EiC for IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) from 2020 to 2021. He served as co-Chair for ACM MM 2018 and co-Chair for ACM CIKM 2019. His current research interests are in the areas of multimodal big data and intelligence, and multimedia networking. He has received 10 Best Paper Awards. He is a member of Academia Europaea, an IEEE Fellow, an AAAS Fellow, and an SPIE Fellow.
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: AutoAttend: Automated Attention Representation Search
  Wed. Jul 21st 04:00 -- 06:00 AM
More from the Same Authors
- 2023 Poster: Curriculum Co-disentangled Representation Learning across Multiple Environments for Social Recommendation
  Xin Wang · Zirui Pan · Yuwei Zhou · Hong Chen · Chendi Ge · Wenwu Zhu
- 2023 Poster: Wasserstein Barycenter Matching for Graph Size Generalization of Message Passing Neural Networks
  Xu Chu · Yujie Jin · Xin Wang · Shanghang Zhang · Yasha Wang · Wenwu Zhu · Hong Mei
- 2022 Poster: Graph Neural Architecture Search Under Distribution Shifts
  Yijian Qin · Xin Wang · Ziwei Zhang · Pengtao Xie · Wenwu Zhu
- 2022 Spotlight: Graph Neural Architecture Search Under Distribution Shifts
  Yijian Qin · Xin Wang · Ziwei Zhang · Pengtao Xie · Wenwu Zhu
- 2022 Poster: Auxiliary Learning with Joint Task and Data Scheduling
  Hong Chen · Xin Wang · Chaoyu Guan · Yue Liu · Wenwu Zhu
- 2022 Spotlight: Auxiliary Learning with Joint Task and Data Scheduling
  Hong Chen · Xin Wang · Chaoyu Guan · Yue Liu · Wenwu Zhu
- 2022 Poster: DNA: Domain Generalization with Diversified Neural Averaging
  Xu Chu · Yujie Jin · Wenwu Zhu · Yasha Wang · Xin Wang · Shanghang Zhang · Hong Mei
- 2022 Poster: Parametric Visual Program Induction with Function Modularization
  Xuguang Duan · Xin Wang · Ziwei Zhang · Wenwu Zhu
- 2022 Poster: Large-Scale Graph Neural Architecture Search
  Chaoyu Guan · Xin Wang · Hong Chen · Ziwei Zhang · Wenwu Zhu
- 2022 Spotlight: Large-Scale Graph Neural Architecture Search
  Chaoyu Guan · Xin Wang · Hong Chen · Ziwei Zhang · Wenwu Zhu
- 2022 Spotlight: Parametric Visual Program Induction with Function Modularization
  Xuguang Duan · Xin Wang · Ziwei Zhang · Wenwu Zhu
- 2022 Spotlight: DNA: Domain Generalization with Diversified Neural Averaging
  Xu Chu · Yujie Jin · Wenwu Zhu · Yasha Wang · Xin Wang · Shanghang Zhang · Hong Mei
- 2021 Workshop: ICML Workshop on Human in the Loop Learning (HILL)
  Trevor Darrell · Xin Wang · Li Erran Li · Fisher Yu · Zeynep Akata · Wenwu Zhu · Pradeep Ravikumar · Shiji Zhou · Shanghang Zhang · Kalesha Bullard
- 2021 Poster: Explainable Automated Graph Representation Learning with Hyperparameter Importance
  Xin Wang · Shuyi Fan · Kun Kuang · Wenwu Zhu
- 2021 Spotlight: Explainable Automated Graph Representation Learning with Hyperparameter Importance
  Xin Wang · Shuyi Fan · Kun Kuang · Wenwu Zhu
- 2020 : Invited Talk 10: Prof. Wenwu Zhu from Tsinghua University
  Wenwu Zhu
- 2019 : Poster Session 1 (all papers)
  Matilde Gargiani · Yochai Zur · Chaim Baskin · Evgenii Zheltonozhskii · Liam Li · Ameet Talwalkar · Xuedong Shang · Harkirat Singh Behl · Atilim Gunes Baydin · Ivo Couckuyt · Tom Dhaene · Chieh Lin · Wei Wei · Min Sun · Orchid Majumder · Michele Donini · Yoshihiko Ozaki · Ryan P. Adams · Christian Geißler · Ping Luo · zhanglin peng · Ruimao Zhang · John Langford · Rich Caruana · Debadeepta Dey · Charles Weill · Xavi Gonzalvo · Scott Yang · Scott Yak · Eugen Hotaj · Vladimir Macko · Mehryar Mohri · Corinna Cortes · Stefan Webb · Jonathan Chen · Martin Jankowiak · Noah Goodman · Aaron Klein · Frank Hutter · Mojan Javaheripi · Mohammad Samragh · Sungbin Lim · Taesup Kim · SUNGWOONG KIM · Michael Volpp · Iddo Drori · Yamuna Krishnamurthy · Kyunghyun Cho · Stanislaw Jastrzebski · Quentin de Laroussilhe · Mingxing Tan · Xiao Ma · Neil Houlsby · Andrea Gesmundo · Zalán Borsos · Krzysztof Maziarz · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune · Pieter Gijsbers · Joaquin Vanschoren · Felix Mohr · Eyke Hüllermeier · Zheng Xiong · Wenpeng Zhang · Wenwu Zhu · Weijia Shao · Aleksandra Faust · Michal Valko · Michael Y Li · Hugo Jair Escalante · Marcel Wever · Andrey Khorlin · Tara Javidi · Anthony Francis · Saurajit Mukherjee · Jungtaek Kim · Michael McCourt · Saehoon Kim · Tackgeun You · Seungjin Choi · Nicolas Knudde · Alexander Tornede · Ghassen Jerfel
- 2019 Poster: Disentangled Graph Convolutional Networks
  Jianxin Ma · Peng Cui · Kun Kuang · Xin Wang · Wenwu Zhu
- 2019 Oral: Disentangled Graph Convolutional Networks
  Jianxin Ma · Peng Cui · Kun Kuang · Xin Wang · Wenwu Zhu
- 2017 Poster: Projection-free Distributed Online Learning in Networks
  Wenpeng Zhang · Peilin Zhao · Wenwu Zhu · Steven Hoi · Tong Zhang
- 2017 Talk: Projection-free Distributed Online Learning in Networks
  Wenpeng Zhang · Peilin Zhao · Wenwu Zhu · Steven Hoi · Tong Zhang