

Poster

Vector-quantized Masked Auto-encoders on Molecular Surfaces

Fang Wu · Stan Z Li


Abstract:

Molecular surfaces carry fingerprints of the interaction patterns between proteins. However, compared with amino acid sequences and 3D structures, far less effort has gone into exploiting the abundant information on protein surfaces for analyzing proteins' biological functions. To close this gap, we propose a novel surface-based unsupervised learning algorithm termed Surface-VQMAE. Because surface point clouds are sparse and unordered, we first partition them into patches and arrange the patches sequentially along the Morton curve. Next, we introduce a Transformer-based architecture named SurfFormer to integrate the surface geometry and capture patch-level relations. Finally, we enhance the prevalent masked auto-encoder (MAE) with the vector quantization (VQ) technique, which establishes a surface pattern codebook to enforce a discrete posterior distribution of latent variables and achieve more condensed semantics. Our work is the first to implement pretraining purely on molecular surfaces, and extensive experiments on diverse real-life scenarios, including binding site recognition, binding affinity prediction, and mutant effect estimation, demonstrate its effectiveness.
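
As a rough illustration of two ingredients named in the abstract, the sketch below orders patch centers along a Morton (Z-order) curve and looks up the nearest entry in a codebook. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `morton_order` and `quantize`, the representation of a patch by its 3D center, the grid resolution `bits`, and the min-max normalization are all hypothetical choices made here. SurfFormer, the masking scheme, and codebook training are not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _spread_bits(v: int, bits: int) -> int:
    """Place bit i of v at position 3*i, leaving two zero bits in between."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton_code(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the bits of three grid indices into one Z-order key."""
    return (_spread_bits(ix, bits)
            | (_spread_bits(iy, bits) << 1)
            | (_spread_bits(iz, bits) << 2))


def morton_order(centers: Sequence[Point], bits: int = 10) -> List[int]:
    """Return indices of `centers` sorted along the Morton curve.

    Assumption: each patch is summarized by a 3D center, and coordinates
    are min-max normalized onto a grid with 2**bits cells per axis.
    """
    if not centers:
        return []
    lo = [min(c[d] for c in centers) for d in range(3)]
    hi = [max(c[d] for c in centers) for d in range(3)]
    cells = (1 << bits) - 1

    def key(i: int) -> int:
        idx = []
        for d in range(3):
            span = hi[d] - lo[d]
            # An axis on which all centers coincide maps to cell 0.
            t = (centers[i][d] - lo[d]) / span if span > 0 else 0.0
            idx.append(int(t * cells))
        return morton_code(idx[0], idx[1], idx[2], bits)

    return sorted(range(len(centers)), key=key)


def quantize(z: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry nearest to z (squared Euclidean distance)."""
    def dist2(entry: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(z, entry))

    return min(range(len(codebook)), key=lambda k: dist2(codebook[k]))
```

Sorting by the interleaved key keeps spatially nearby patches close together in the resulting sequence, which is the usual motivation for feeding Morton-ordered patches to a sequence model. The `quantize` lookup shows only the discrete assignment step of vector quantization; how the codebook is learned is outside this sketch.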
