

Oral in Workshop: Machine Learning for Multimodal Healthcare Data

Death Prediction by Race in Colorectal Cancer Patients Using Machine Learning Approaches

Abiel Roche-Lima · Frances Aponte · Frances Heredia Negron · Brenda Nieves-Rodriguez

Keywords: [ Computational Biology ] [ Electronic healthcare records ] [ Multimodal biomarkers ] [ Benchmarking, domain shifts, and generalization ]


Abstract:

Colorectal cancer (CRC) cases have increased worldwide. In the USA, African Americans have a higher incidence than other races. In this paper, we use machine learning (ML) to study the factors associated with CRC mortality by race after treatment and to build models that predict death. Our data came from metastatic CRC gene sequencing studies. Patients were included if they had received chemotherapy and were grouped by race (White American and African American). We implemented five supervised ML methods to build predictive models and used a Mini-Batched-Normalized-Mutual-Information-Hybrid-Feature-Selection method to select features from a set that included more than 25,000 genes. The best model was obtained with the Classification-Regression-Trees algorithm (AUC-ROC = 0.91 for White Americans, AUC-ROC = 0.89 for African Americans). The features "DBNL gene", "PIN1P1 gene" and "Days-from-birth" were the variables most strongly associated with CRC mortality for White Americans, while "IFI44L-gene", "ART4-gene" and "Sex" were the most relevant for African Americans. In conclusion, these features and models are promising candidates for further analysis and for decision-making tools that study CRC from a precision medicine perspective for minority health.
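The per-group AUC-ROC figures in the abstract can be read as the probability that a model scores a randomly chosen patient who died above a randomly chosen patient who survived, computed separately for each race group. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function names `auc_roc` and `auc_by_group`, the group labels "A" and "B", and the toy records are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """records: iterable of (group, label, score) tuples.
    Returns a dict mapping each group to its own AUC-ROC."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: auc_roc(ys, ss) for g, (ys, ss) in grouped.items()}


# Made-up illustrative records: (group, died, predicted risk).
# These are assumptions for the sketch, not data or results from the paper.
toy_records = [
    ("A", 1, 0.9), ("A", 0, 0.2), ("A", 1, 0.6), ("A", 0, 0.7),
    ("B", 1, 0.8), ("B", 0, 0.1), ("B", 0, 0.3), ("B", 1, 0.4),
]
toy_auc = auc_by_group(toy_records)
```

Evaluating each group on its own, rather than pooling all patients, is what allows one AUC-ROC to be reported per race group. The pairwise loop is quadratic in the group size, which is acceptable for a sketch but would normally be replaced by a rank-based computation on larger cohorts.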
