Deep neural networks identify sequence context features predictive of transcription factor binding

Workshop Poster in Workshop: ICML 2021 Workshop on Computational Biology

AN ZHENG

Abstract:

Transcription factors bind DNA by recognizing specific sequence motifs, which are typically 6–12 bp long. A motif can occur many thousands of times in the human genome, but only a subset of those sites is actually bound. Here we present a machine-learning framework that leverages existing convolutional neural network architectures and model interpretation techniques to identify and interpret the sequence context features most important for predicting whether a particular motif instance will be bound. We apply our framework to predict binding at motifs for 38 transcription factors in a lymphoblastoid cell line, score the importance of context sequences at base-pair resolution, and characterize the context features most predictive of binding. We find that the choice of training data heavily influences classification accuracy and the relative importance of features such as open chromatin. Overall, our framework enables novel insights into features predictive of transcription factor binding and is likely to inform future deep learning applications to interpret non-coding genetic variants.
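To make "scoring the importance of context sequences at base-pair resolution" concrete, here is a minimal sketch using in-silico mutagenesis. The abstract does not name the interpretation technique or the network architecture, so both are assumptions here: `predict_bound` is a hypothetical stand-in for a trained CNN (a single made-up convolutional filter with global max pooling and a sigmoid), and `mutagenesis_scores` is one common way to attribute a prediction to individual bases, not necessarily the one used in this work.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


# Hypothetical stand-in for a trained CNN: one convolutional filter that
# responds to "GATA". The weights and bias are invented for illustration.
TOY_FILTER = [
    [0.0, 0.0, 1.0, 0.0],  # prefers G
    [1.0, 0.0, 0.0, 0.0],  # prefers A
    [0.0, 0.0, 0.0, 1.0],  # prefers T
    [1.0, 0.0, 0.0, 0.0],  # prefers A
]
BIAS = -2.0


def predict_bound(seq):
    """Return a toy probability that the motif instance in `seq` is bound."""
    x = one_hot(seq)
    width = len(TOY_FILTER)
    # Slide the filter along the sequence (a 1D convolution).
    activations = [
        sum(x[start + i][j] * TOY_FILTER[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(x) - width + 1)
    ]
    # Global max pooling followed by a sigmoid.
    return 1.0 / (1.0 + math.exp(-(max(activations) + BIAS)))


def mutagenesis_scores(seq):
    """Per-base importance: mean drop in prediction over the 3 substitutions."""
    reference = predict_bound(seq)
    scores = []
    for pos, base in enumerate(seq):
        drops = [
            reference - predict_bound(seq[:pos] + alt + seq[pos + 1:])
            for alt in BASES if alt != base
        ]
        scores.append(sum(drops) / len(drops))
    return scores


if __name__ == "__main__":
    window = "CCGATACC"  # made-up context window around a motif instance
    for base, score in zip(window, mutagenesis_scores(window)):
        print(base, round(score, 3))
```

With this toy filter, only the four bases that match the filter receive nonzero scores. A real model trained on binding data would typically also assign importance to flanking context, which is the kind of signal the framework is designed to surface.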
