Fast Mixture of Curvature-Aware Experts for Diverse and Dynamic Graph Topologies
Abstract
Dynamic graph learning, which models the emergence, vanishing, and reconnection of nodes and edges, is crucial for real-world applications. In dynamic graphs, node neighborhoods often exhibit diverse and time-evolving topologies, including hierarchical, grid-like, and cyclic patterns. Existing methods typically embed graphs into a single curvature space, which degrades node representations when the embedding geometry is not well aligned with the local graph topology. In this paper, we propose DyGMoCE, a Dynamic Graph Transformer with a Mixture of Curvature-aware Experts, which efficiently embeds each node at every timestamp into an adaptive curvature space. Specifically, DyGMoCE incorporates a mixture-of-experts framework into both the attention and feed-forward modules, where each expert operates on a Riemannian manifold with a distinct curvature. Motivated by the geometric continuity across experts, we then introduce a routing mechanism with a ranking constraint. To improve efficiency, we design a mathematically equivalent fast Riemannian attention module, yielding an average speedup of 26.3% and a memory reduction of 52.0% for DyGMoCE. Notably, the fast Riemannian attention module is broadly applicable to Transformer models with sequence inputs. Extensive experiments show that DyGMoCE significantly outperforms other state-of-the-art methods.