Poster Mon, Jul 6, 2026 • 10:00 PM – 11:45 PM PDT HALL A #3803

Structured Multi-modal Graph Disentanglement for Psychiatric Diagnosis

Hongyu Shi ⋅ Kaizhong Zheng ⋅ WS Zhai ⋅ Shuai Jiang ⋅ Badong Chen ⋅ Liangjun Chen

Abstract

Multi-modal neuroimaging-based psychiatric diagnosis must integrate cross-modal agreement with modality-specific complementarity, yet in real multi-site cohorts these signals are frequently entangled with site- and cohort-dependent correlations, yielding shortcut-driven predictions and limited interpretability. We propose Structured Multi-modal Graph Disentanglement (SMGD), which explicitly factorizes multi-modal graph representations into four components with distinct roles: shared diagnostic evidence, complementary diagnostic evidence, incidental cross-modal agreement, and modality-specific non-robust correlations, with the former two forming the diagnostic core and the latter two suppressed as shortcuts. SMGD is realized as geometry-driven structure learning: under a mild distributional assumption, we develop mini-batch estimable surrogate regularizers that shape subspace organization and cross-modal relations, enforcing semantic consistency through relational geometry rather than centroid coincidence while suppressing confounded dependencies. Experiments on large multi-site datasets show improved in-domain diagnosis and more reliable cross-dataset generalization in the presence of a modality gap, without relying on expert-crafted diagnostic biomarkers.
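
As a rough illustration of two ideas in the abstract (a four-way factorization of each modality's representation, and consistency enforced through relational geometry rather than centroid coincidence), the following NumPy sketch may help. It is not the SMGD implementation: the contiguous block split, the function names (`split_factors`, `cosine_gram`, `relational_consistency`, `centroid_gap`), and the cosine-Gram loss are all assumptions made for illustration, and the abstract does not specify the actual regularizers.

```python
import numpy as np

# Hypothetical factor names, taken from the four roles listed in the abstract.
FACTORS = ("shared", "complementary", "incidental", "nonrobust")


def split_factors(z, dims):
    """Split a (batch, d) embedding into four contiguous blocks.

    Assumption: factors occupy contiguous slices of the embedding.
    The real method may organize subspaces differently.
    """
    if len(dims) != len(FACTORS) or z.shape[1] != sum(dims):
        raise ValueError("dims must have four entries summing to z.shape[1]")
    bounds = np.cumsum((0,) + tuple(dims))
    return {name: z[:, bounds[i]:bounds[i + 1]] for i, name in enumerate(FACTORS)}


def cosine_gram(x, eps=1e-8):
    """Pairwise cosine similarities between the samples of a mini-batch."""
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    return x @ x.T


def relational_consistency(a, b):
    """Mean squared difference between the two within-batch Gram matrices.

    Small values mean the two modalities arrange the same subjects in the
    same relative geometry, even if the embeddings themselves differ.
    """
    return float(np.mean((cosine_gram(a) - cosine_gram(b)) ** 2))


def centroid_gap(a, b):
    """Euclidean distance between the batch centroids of two modalities."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))


# Toy usage: modality B is a rotated, rescaled copy of modality A.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16)) + 1.0          # stand-in for one modality's embeddings
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # random orthogonal matrix

shared_a = split_factors(z_a, (4, 4, 4, 4))["shared"]
shared_b = 3.0 * shared_a @ q                 # same relations, different location and scale

print("relational loss:", relational_consistency(shared_a, shared_b))
print("centroid gap:   ", centroid_gap(shared_a, shared_b))
```

In this toy setup the relational loss is close to zero while the centroid gap stays large, which loosely mirrors the situation the abstract describes: two modalities can agree in how they relate subjects to one another without their embeddings coinciding. A centroid-matching penalty would instead push against the gap itself.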

Lay Summary

This paper studies how to make psychiatric diagnosis from multi-modal brain imaging more reliable. Existing methods may mix true disease-related signals with dataset-specific shortcuts or noise. SMGD addresses this by separating multi-modal brain graph representations into diagnostic factors and shortcut-like factors, so that predictions rely more on meaningful brain patterns. Experiments on multi-site datasets show that this structured representation improves diagnosis and generalization across datasets.