Over 80% of clinical genetics and omics data were collected from individuals of European ancestry (EA), who make up approximately 16% of the world’s population. This severe data disadvantage for non-EA populations is set to generate new health disparities as machine learning-powered biomedical research and health care become increasingly common. The new health disparity arising from data inequality can potentially affect every data-disadvantaged ethnic group in every disease where data inequality exists. Its negative impact is therefore not limited to diseases for which significant racial/ethnic disparities are already evident. In recent work, we showed that the currently prevalent scheme for machine learning with multiethnic data, the mixture learning scheme, and its main alternative, the independent learning scheme, are prone to producing models with relatively low performance for data-disadvantaged ethnic groups, owing to inadequate training data and data distribution discrepancies among ethnic groups. We found that transfer learning can provide improved machine learning models for data-disadvantaged ethnic groups by leveraging knowledge learned from other groups with more abundant data. These results indicate that transfer learning can be an effective approach to reducing health care disparities arising from data inequality among ethnic groups.
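The three learning schemes named above can be contrasted in a minimal sketch. This is not the authors' implementation: it assumes synthetic data, a plain logistic regression trained by gradient descent, and transfer learning realized as a warm start from the data-rich group followed by brief fine-tuning on the data-poor group. All names (`make_group`, `fit_logreg`, `compare_schemes`), sample sizes, and hyperparameters are hypothetical choices for illustration.

```python
import numpy as np

def make_group(n, shift, rng):
    """Synthetic binary-classification data; `shift` perturbs the true weights
    to mimic a distribution discrepancy between groups (an assumption)."""
    X = rng.normal(size=(n, 5))
    w_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + shift
    y = (X @ w_true + 0.3 * rng.normal(size=n) > 0).astype(float)
    return X, y

def fit_logreg(X, y, w=None, lr=0.1, steps=300):
    """Logistic regression by gradient descent; passing `w` gives a warm start."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(((X @ w) > 0) == (y > 0.5)))

def compare_schemes(seed=0):
    rng = np.random.default_rng(seed)
    shift = np.array([0.0, 0.0, 0.0, 1.0, -1.0])
    X_maj, y_maj = make_group(2000, 0.0, rng)     # data-rich group
    X_min, y_min = make_group(40, shift, rng)     # data-disadvantaged group
    X_test, y_test = make_group(1000, shift, rng) # held-out data for that group

    # Mixture learning: one model trained on all groups pooled together.
    w_mix = fit_logreg(np.vstack([X_maj, X_min]), np.concatenate([y_maj, y_min]))
    # Independent learning: a model trained on the small group alone.
    w_ind = fit_logreg(X_min, y_min)
    # Transfer learning (sketch): pretrain on the data-rich group,
    # then fine-tune briefly on the small group.
    w_src = fit_logreg(X_maj, y_maj)
    w_tl = fit_logreg(X_min, y_min, w=w_src, lr=0.05, steps=50)

    return {
        "mixture": accuracy(w_mix, X_test, y_test),
        "independent": accuracy(w_ind, X_test, y_test),
        "transfer": accuracy(w_tl, X_test, y_test),
    }

if __name__ == "__main__":
    for scheme, acc in compare_schemes().items():
        print(f"{scheme}: {acc:.3f}")
```

The relative ranking of the three schemes on this toy problem depends on the assumed shift, sample sizes, and fine-tuning settings, so it should not be read as a reproduction of the reported results.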
Author Information
Yan Gao (UTHSC)
Dr. Yan Gao is a Postdoctoral Research Fellow in the Department of Genetics, Genomics, and Informatics, where his research focuses on machine learning, deep learning, and transfer learning in healthcare. He earned a Ph.D. in Computer Engineering with a focus on machine learning, data mining, and smart housing.
More from the Same Authors
- 2021: Highlight 6 | Data Inequality, Machine Learning and Health Disparity (Workshop CompBio · Yan Gao)