Because of data scarcity in real-world scenarios, obtaining pre-trained representations via self-supervised learning (SSL) has attracted increasing interest. Although various methods have been proposed, it remains under-explored what knowledge the networks learn from the pre-training tasks and how it relates to downstream properties. In this work, with an emphasis on chemical molecular graphs, we fill this gap by devising a range of node-level, pair-level, and graph-level probe tasks to analyse the representations from pre-trained graph neural networks (GNNs). We empirically show that: 1. Pre-trained models have better downstream performance than randomly initialised models due to their improved capability of capturing global topology and recognising substructures. 2. However, randomly initialised models outperform pre-trained models in terms of retaining local topology. Such information gradually disappears from the early layers to the last layers for pre-trained models.
Author Information
Hanchen Wang (Cambridge; Caltech)
Joint PostDoc, ML for Genomics
Shengchao Liu (Mila, Université de Montréal)
Jean Kaddour (UCL)
Qi Liu (Department of Computer Science, University of Oxford)
Jian Tang (Mila)
Matt Kusner (University College London)
Joan Lasenby (University of Cambridge)
More from the Same Authors
-
2022 : GAUCHE: A Library for Gaussian Processes in Chemistry »
Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Yuanqi Du · Arian Jamasb · Julius Schwartz · Austin Tripp · Bojana Ranković · Philippe Schwaller · Gregory Kell · Anthony Bourached · Alexander Chan · Jacob Moss · Chengzhi Guo · Alpha Lee · Jian Tang -
2022 : Adapting to Shifts in Latent Confounders via Observed Concepts and Proxies »
Matt Kusner · Ibrahim Alabdulmohsin · Stephen Pfohl · Olawale Salaudeen · Arthur Gretton · Sanmi Koyejo · Jessica Schrouff · Alexander D'Amour -
2022 : Flaky Performances when Pre-Training on Relational Databases with a Plan for Future Characterization Efforts »
Shengchao Liu · David Vazquez · Jian Tang · Pierre-André Noël -
2022 : Protein Representation Learning by Geometric Structure Pretraining »
Zuobai Zhang · Minghao Xu · Arian Jamasb · Vijil Chenthamarakshan · Aurelie Lozano · Payel Das · Jian Tang -
2022 : Evaluating Self-Supervised Learned Molecular Graphs »
Hanchen Wang · Shengchao Liu · Jean Kaddour · Qi Liu · Jian Tang · Matt Kusner · Joan Lasenby -
2023 Poster: A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining »
Shengchao Liu · Weitao Du · Zhiming Ma · Hongyu Guo · Jian Tang -
2023 Poster: FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning »
Songtao Liu · Zhengkai Tu · Minkai Xu · Zuobai Zhang · Lu Lin · Rex Ying · Jian Tang · Peilin Zhao · Dinghao Wu -
2023 Poster: ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts »
Minghao Xu · Xinyu Yuan · Santiago Miret · Jian Tang -
2023 Oral: ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts »
Minghao Xu · Xinyu Yuan · Santiago Miret · Jian Tang -
2022 Workshop: AI for Science »
Yuanqi Du · Tianfan Fu · Wenhao Gao · Kexin Huang · Shengchao Liu · Ziming Liu · Hanchen Wang · Connor Coley · Le Song · Linfeng Zhang · Marinka Zitnik -
2022 Workshop: The First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward »
Huaxiu Yao · Hugo Larochelle · Percy Liang · Colin Raffel · Jian Tang · Ying Wei · Saining Xie · Eric Xing · Chelsea Finn -
2022 Poster: Generative Coarse-Graining of Molecular Conformations »
Wujie Wang · Minkai Xu · Chen Cai · Benjamin Kurt Miller · Tess Smidt · Yusu Wang · Jian Tang · Rafael Gomez-Bombarelli -
2022 Spotlight: Generative Coarse-Graining of Molecular Conformations »
Wujie Wang · Minkai Xu · Chen Cai · Benjamin Kurt Miller · Tess Smidt · Yusu Wang · Jian Tang · Rafael Gomez-Bombarelli -
2020 Poster: Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning »
Sai Krishna Gottipati · Boris Sattarov · Sufeng Niu · Yashaswi Pathak · Haoran Wei · Shengchao Liu · Simon Blackburn · Karam Thomas · Connor Coley · Jian Tang · Sarath Chandar · Yoshua Bengio