

Poster in Workshop: DMLR Workshop: Data-centric Machine Learning Research

Learning pipeline-invariant representation for robust brain phenotype prediction

Xinhui Li · Alex Fedorov · Mrinal Mathur · Anees Abrol · Gregory Kiar · Sergey Plis · Vince Calhoun


Abstract:

Deep learning has been widely applied in neuroimaging, including for predicting brain-phenotype relationships from magnetic resonance imaging (MRI) volumes. MRI data usually require extensive preprocessing prior to modeling, but variation introduced by different MRI preprocessing pipelines may lead to different scientific findings, even when the underlying data are identical. Meanwhile, the machine learning community has emphasized the importance of shifting from model-centric to data-centric approaches, given the essential role of data quality in deep learning applications. Motivated by this data-centric perspective, we first evaluate how the choice of preprocessing pipeline affects the downstream performance of a supervised learning model. We then propose two pipeline-invariant representation learning methodologies, MPSL and PXL, to improve the robustness of classification performance and to learn similar neural network representations across pipelines. Using a wide range of sample sizes from the UK Biobank dataset, we demonstrate that the two models share common advantages: both MPSL and PXL can improve within-sample prediction performance and out-of-sample generalization, and both can learn more similar between-pipeline representations. These results suggest that our proposed models can be applied to mitigate pipeline-related biases and to improve prediction robustness in brain-phenotype modeling.
