Scaling the Prior: Size-Consistent Geometric Diffusion for 3D Molecular Generation
Abstract
Diffusion models typically operate in fixed-dimensional metric spaces, whereas geometric molecular data vary in dimensionality because molecules differ in size (number of atoms). A common adaptation in diffusion models for geometric molecular generation is to use architectures that handle variable-sized inputs, such as graph neural networks and transformers. However, these approaches ignore that molecular size also sets the spatial scale of the atomic coordinates, which induces inconsistent generative trajectories across sizes. In 3D molecular diffusion, generation can be seen as forming a coarse structure first and then refining atomic positions. Larger molecules form their coarse structures earlier than smaller ones because their spatial scales are larger relative to the noise. The generative process is therefore inconsistent across sizes, with trajectories driven by molecular size rather than by a unified generative pattern. We are the first to identify and analyze this size-induced inconsistency by decomposing the denoising dynamics, showing how spatial scale shapes the formation of both 3D structure and atom types. Based on this analysis, we propose Scaling the Prior (StP), which rescales the prior distribution according to molecular size to normalize learning and generation across sizes, harmonize denoising trajectories, and generate consistently high-quality molecules.
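As a rough illustration of what rescaling the prior by molecular size could look like, the sketch below draws zero-centered Gaussian coordinates whose standard deviation grows with the number of atoms. The abstract does not state the scaling rule, so the function names (`size_scale`, `sample_scaled_prior`) and the power-law form with exponent 1/3 are assumptions chosen only for illustration (a radius growing roughly like the cube root of the atom count at constant density), not the method's actual definition.

```python
import random


def size_scale(num_atoms, exponent=1.0 / 3.0):
    """Hypothetical size-dependent scale; the power law is an assumption."""
    return float(num_atoms) ** exponent


def sample_scaled_prior(num_atoms, rng, exponent=1.0 / 3.0):
    """Draw Gaussian 3D coordinates with a size-dependent std, then center them."""
    sigma = size_scale(num_atoms, exponent)
    coords = [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(num_atoms)]
    # Subtract the centroid so the sample has zero center of mass.
    centroid = [sum(atom[d] for atom in coords) / num_atoms for d in range(3)]
    return [[atom[d] - centroid[d] for d in range(3)] for atom in coords]


rng = random.Random(0)
for n in (5, 20, 80):
    sample = sample_scaled_prior(n, rng)
    print(n, round(size_scale(n), 3), len(sample))
```

With a fixed unit-variance prior, the noise would be small relative to a large molecule's extent and large relative to a small molecule's. Tying the prior scale to size, in whatever form the method actually uses, is meant to keep that ratio comparable across sizes.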