Discriminative Mixture-of-Experts on Graphs with Reliable Expert Fusion
Haoyue Deng ⋅ Menghui Wang ⋅ Yunlong Zhou ⋅ Ziwei Zhang ⋅ Ran Zhang ⋅ Chunming Hu ⋅ Xiao Wang
Abstract
Graph Mixture-of-Experts (Graph-MoE) offers a way to scale graph neural networks (GNNs) via adaptive capacity allocation, with the goal of allowing different experts to capture diverse graph patterns. Its effectiveness heavily depends on the coordination between routing decisions and expert specialization. However, through an extensive empirical study, we identify two critical phenomena. First, discrimination loss occurs on both the expert and routing sides: GNN experts become highly homogenized, and the router collapses onto a small subset of experts, failing to reflect diverse graph semantics. Second, routing uncertainty is prevalent: existing routers produce uncertain expert assignments for most nodes, and this uncertainty exhibits a strong negative correlation with model performance. To address these issues, we propose C$^2$GMoE, a novel **G**raph-**MoE** framework featuring **C**ontrastive routing and **C**onfidence-aware fusion. We introduce a group-wise contrastive routing strategy that provides explicit guidance for routing optimization by aligning node-level routing decisions with semantic clusters while satisfying load-balancing constraints. Moreover, guided by a theoretical analysis of generalization error, we develop a confidence-aware fusion mechanism that adaptively reweights expert predictions according to their confidence. Extensive experiments across multiple benchmarks demonstrate the effectiveness of the proposed C$^2$GMoE.
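The abstract describes confidence-aware fusion only at a high level. Below is a minimal sketch of one plausible reading, assuming confidence is measured by the normalized entropy of each expert's softmax output and multiplied into the routing weights before renormalization. The function name, tensor shapes, and the entropy-based confidence measure are illustrative assumptions, not definitions from the paper.

```python
import math
import torch
import torch.nn.functional as F

def confidence_aware_fusion(expert_logits: torch.Tensor,
                            router_weights: torch.Tensor) -> torch.Tensor:
    """Fuse per-expert predictions, down-weighting uncertain experts.

    expert_logits:  [E, N, C]  raw class logits from each of E experts
    router_weights: [N, E]     softmax routing scores from the gating network
    returns:        [N, C]     fused class distribution per node
    """
    probs = F.softmax(expert_logits, dim=-1)                      # [E, N, C]
    # Per-expert, per-node prediction entropy, normalized to [0, 1].
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)     # [E, N]
    confidence = 1.0 - entropy / math.log(expert_logits.shape[-1])
    # Scale routing scores by confidence, then renormalize over experts
    # so the fused output remains a valid probability distribution.
    weights = router_weights.t() * confidence                     # [E, N]
    weights = weights / weights.sum(dim=0, keepdim=True).clamp_min(1e-12)
    return (weights.unsqueeze(-1) * probs).sum(dim=0)             # [N, C]

# Toy usage: 4 experts, 10 nodes, 3 classes.
logits = torch.randn(4, 10, 3)
gates = F.softmax(torch.randn(10, 4), dim=-1)
fused = confidence_aware_fusion(logits, gates)  # [10, 3], rows sum to 1
```

In a sparse top-k MoE, `router_weights` would be zero outside the selected experts, so the reweighting effectively acts only on the active experts for each node.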