

Poster in Workshop: Data-centric Machine Learning Research (DMLR): Datasets for Foundation Models

A data-centric approach for assessing progress of Graph Neural Networks

Tianqi Zhao · Thi Ngan Dong · Alan Hanjalic · Megha Khosla


Abstract:

Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks. However, most improvements are in multi-class classification, with less focus on scenarios where nodes have multiple labels. The first challenge in studying multi-label node classification is the scarcity of publicly available datasets. To address this, we collected and released three real-world biological datasets and developed a multi-label graph generator with tunable properties. We also argue that traditional notions of homophily and heterophily do not apply well to multi-label scenarios. Therefore, we define homophily and Cross-Class Neighborhood Similarity for multi-label classification and analyze nine collected datasets. Lastly, we conducted a large-scale comparative study with eight methods across nine datasets to evaluate current progress in multi-label node classification.
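
To make the idea of multi-label homophily concrete, the sketch below computes one plausible edge-level measure: the mean Jaccard similarity between the label sets of connected nodes. The abstract does not state the authors' exact definition, so this formula, the function names `jaccard` and `multilabel_edge_homophily`, and the toy graph are illustrative assumptions only.

```python
from typing import Dict, Iterable, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two label sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def multilabel_edge_homophily(
    edges: Iterable[Tuple[int, int]],
    labels: Dict[int, Set[int]],
) -> float:
    """Mean Jaccard similarity of label sets over all edges.

    ASSUMPTION: this is one plausible multi-label homophily measure,
    not necessarily the definition used in the paper.
    """
    edge_list = list(edges)
    if not edge_list:
        return 0.0
    total = sum(jaccard(labels[u], labels[v]) for u, v in edge_list)
    return total / len(edge_list)


# Hypothetical toy graph: four nodes, each with a set of label ids.
labels = {0: {0, 1}, 1: {1}, 2: {2}, 3: {0, 1}}
edges = [(0, 1), (0, 3), (1, 2)]

homophily = multilabel_edge_homophily(edges, labels)
print(f"multi-label edge homophily: {homophily:.2f}")
```

In a single-label graph every label set has one element, so each edge contributes either 0 or 1 and the measure reduces to the fraction of edges that connect same-class nodes. With multiple labels per node, partial overlap yields intermediate values, which is one reason the single-label notion does not carry over directly.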
