Poster in Workshop: AI for Science: Scaling in AI for Scientific Discovery
The Efficacy of Pre-training in Chemical Graph Out-of-distribution Generalization
Qi Liu · Rosa Chan · Rose Yu
Keywords: [ Self-Supervised Pre-training ] [ Out-of-Distribution Generalization ] [ Graph Neural Networks ]
Graph neural networks have shown significant progress on a variety of tasks, yet their ability to generalize in out-of-distribution (OOD) scenarios remains an open question. In this study, we conduct a comprehensive benchmark of the efficacy of pre-trained chemical graph models under OOD challenges, which we name PODGenGraph. We run extensive experiments across diverse chemical graph datasets spanning a range of graph sizes. The benchmark is framed around distinct distribution shifts, covering both concept and covariate shifts, while also varying the degree of shift. Our findings are striking: even basic pre-trained models achieve performance that is not only comparable to, but often surpasses, models specifically designed to handle distribution shift. We further investigate these results, examining how key factors of pre-trained models (e.g., sample size, learning rate, and in-distribution performance) influence OOD generalization. Overall, our work shows that pre-training can be a flexible and simple approach to OOD generalization in chemical graph learning. Leveraging pre-trained models for chemical graph OOD generalization in real-world applications is a promising avenue for future research.
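To make the kind of protocol the abstract describes concrete, here is a minimal sketch (not the authors' released PODGenGraph code) of one covariate-shift experiment: a graph-size split on a MoleculeNet task, with a GIN optionally fine-tuned from pre-trained weights. It assumes PyTorch Geometric (and RDKit for the MoleculeNet loader); the dataset choice, split ratio, architecture, and checkpoint path are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import MoleculeNet
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GINConv, global_mean_pool

# Covariate shift via graph size: train on small molecules, test on large ones.
dataset = MoleculeNet(root="data", name="BACE")  # illustrative task choice
graphs = sorted(dataset, key=lambda d: d.num_nodes)
split = int(0.8 * len(graphs))  # 80/20 size-based split (an assumption)
train_loader = DataLoader(graphs[:split], batch_size=64, shuffle=True)
test_loader = DataLoader(graphs[split:], batch_size=64)

class GIN(torch.nn.Module):
    """A small two-layer GIN with a binary classification head."""
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        mlp1 = torch.nn.Sequential(torch.nn.Linear(in_dim, hidden), torch.nn.ReLU(),
                                   torch.nn.Linear(hidden, hidden))
        mlp2 = torch.nn.Sequential(torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
                                   torch.nn.Linear(hidden, hidden))
        self.conv1, self.conv2 = GINConv(mlp1), GINConv(mlp2)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        return self.head(global_mean_pool(h, batch))

model = GIN(dataset.num_node_features)
# Hypothetical checkpoint: load self-supervised pre-trained weights if available;
# comparing this against training from scratch is the benchmark's core contrast.
# model.load_state_dict(torch.load("pretrained_gin.pt"), strict=False)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):
    model.train()
    for batch in train_loader:
        opt.zero_grad()
        out = model(batch.x.float(), batch.edge_index, batch.batch)
        loss = F.binary_cross_entropy_with_logits(out, batch.y.float())
        loss.backward()
        opt.step()

# Evaluate on the held-out large-molecule (OOD) split.
model.eval()
correct = total = 0
with torch.no_grad():
    for batch in test_loader:
        pred = model(batch.x.float(), batch.edge_index, batch.batch) > 0
        correct += (pred.long() == batch.y.long()).sum().item()
        total += batch.y.numel()
print(f"OOD (large-molecule) test accuracy: {correct / total:.3f}")
```

A concept-shift variant of the same sketch would split by the label-conditional distribution (e.g., holding out a scaffold group whose label balance differs from training) rather than by graph size; the fine-tune-and-evaluate loop stays the same.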