Workshop Poster in Workshop: ICML 2021 Workshop on Computational Biology
MultiMAP: Dimensionality Reduction and Integration of Multimodal Data
Mika Jain
Multimodal data is growing rapidly in single-cell biology and in other fields of science and engineering. We introduce MultiMAP, an approach for dimensionality reduction and integration of multiple datasets. MultiMAP is a nonlinear manifold learning technique that recovers a single manifold on which all datasets reside and then projects the data into a single low-dimensional space so as to preserve the manifold structure. MultiMAP has several advantages over existing integration strategies for single-cell data: it can integrate any number of datasets, leverages features that are not present in all datasets (i.e., datasets can have different dimensionalities), is not restricted to a linear mapping, allows the user to specify the influence of each dataset on the embedding, and is extremely scalable to large datasets. We apply MultiMAP to the integration of a variety of single-cell transcriptomics, chromatin accessibility, methylation, and spatial data, and show that it outperforms current approaches in preservation of high-dimensional structure, alignment of datasets, visual separation of clusters, transfer learning, and runtime.