ProMiSE: Protein Multi-state Structure Evaluation Benchmark in Biological Contexts
Abstract
Proteins are inherently dynamic, and their biological functions often emerge from transitions between multiple conformational states. While recent breakthroughs have largely addressed static structure prediction, no systematic benchmark exists to assess how well current models capture functionally relevant dynamics. We introduce ProMiSE, the first benchmark that provides both a dataset and an evaluation scheme, based on native biological assemblies and integrating the major conformational change mechanisms (intrinsic, ligand-induced, and protein-induced) within a single curated dataset. We conduct a comprehensive evaluation of state-of-the-art structure prediction models, including AlphaFold3 and recent generative approaches. Our findings reveal that current models have a limited ability to sample intrinsic dynamics and are often insensitive to biological context in induced scenarios. We further investigate whether these multi-state prediction biases are associated with multiple sequence alignment (MSA) signals or training data distributions, and we analyze internal representations throughout the model to identify where the biases arise. Ultimately, ProMiSE exposes the limitations of current models in conformational diversity and biological relevance, providing a basis for improved multi-state and dynamics-aware modeling.