MAST: Motif-Augmented Diffusion with Search Tree for Spectroscopic Molecular Structure Elucidation
Abstract
Elucidating molecular structures from spectra is a foundational problem in chemical and materials characterization, yet it remains challenging because of spectral ambiguity and the vastness of molecular space. Although recent diffusion-based generators show strong promise for spectra-conditioned elucidation, existing methods struggle to learn robust spectra-structure relationships from limited paired data when they rely solely on a global spectral representation. Moreover, their inference strategy of repeated full sampling incurs substantial computational overhead. To address these limitations, we propose MAST, a Motif-Augmented diffusion framework with Search Tree, for joint 2D-3D spectroscopic molecular structure elucidation. MAST introduces explicit, interpretable motif priors as intermediate evidence throughout denoising, reducing conditional ambiguity and facilitating spectra-conditioned optimization. We further cast diffusion sampling as a reward-guided tree search that prioritizes high-reward denoising trajectories, yielding a compact set of spectra-consistent candidates under a limited sampling budget. On the QM9S multi-spectra benchmark, MAST achieves 94.89% exact recovery and improves 3D fidelity, while preserving high chemical validity and stability.
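To make the idea of casting diffusion sampling as reward-guided tree search concrete, the following is a minimal toy sketch and not the MAST implementation. It assumes a beam-style search in which each retained partial trajectory is expanded into several stochastic denoising children and only the highest-reward children are kept. The functions `denoise_step`, `reward`, and `tree_search_sample`, the parameters `branch` and `beam`, and the vector-valued state are all hypothetical stand-ins introduced for illustration; a real system would use a learned denoiser over molecular graphs and coordinates and a spectra-consistency reward.

```python
import random

def denoise_step(state, t, rng):
    # Hypothetical stand-in for one reverse-diffusion step:
    # the injected noise shrinks as t approaches 0.
    return [x + rng.gauss(0.0, 0.1 * t) for x in state]

def reward(state, target):
    # Hypothetical spectra-consistency score (assumption):
    # negative squared error, so higher means closer to the target.
    return -sum((x - y) ** 2 for x, y in zip(state, target))

def tree_search_sample(target, steps=10, branch=4, beam=3, seed=0):
    """Expand each kept node into `branch` children per step and
    keep the `beam` highest-reward children as the next frontier."""
    rng = random.Random(seed)
    dim = len(target)
    # Start from `beam` independent noise samples.
    frontier = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(beam)]
    for t in range(steps, 0, -1):
        children = [
            denoise_step(state, t, rng)
            for state in frontier
            for _ in range(branch)
        ]
        # Prioritize high-reward trajectories and prune the rest.
        children.sort(key=lambda s: reward(s, target), reverse=True)
        frontier = children[:beam]
    return frontier  # candidates ordered from highest to lowest reward

if __name__ == "__main__":
    toy_target = [0.5, -0.2, 1.0]
    for candidate in tree_search_sample(toy_target):
        print(round(reward(candidate, toy_target), 4), candidate)
```

In this sketch the per-step cost is `beam * branch` denoiser calls, so the total budget is fixed in advance, in contrast to drawing many complete samples independently and filtering them afterwards.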