
Workshop Poster
Workshop: ICML 2021 Workshop on Computational Biology

Prediction of RNA-protein Interactions Using a Nucleotide Language Model

Keisuke Yamada


The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require additional information beyond the sequences themselves. Bidirectional Encoder Representations from Transformers (BERT) is a language-based deep learning model that is highly interpretable; therefore, a model based on the BERT architecture can potentially overcome these limitations. Here, we propose BERT-RBP, a model that predicts RNA-RBP interactions by adapting a BERT architecture pre-trained on a human reference genome. On eCLIP-seq data for 154 RBPs, our model outperformed state-of-the-art prediction models. Detailed analysis further revealed that BERT-RBP could recognize the transcript region type from sequence information alone. Overall, the results provide insights into the mechanism of BERT in biological contexts and evidence for the applicability of the model to other RNA-related problems.
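
As a rough illustration of how a nucleotide sequence might be presented to a BERT-style language model, the sketch below splits an RNA sequence into overlapping k-mer tokens and wraps them in [CLS] and [SEP] markers. The abstract does not describe the tokenization, so everything here is an assumption made for illustration: the k-mer scheme itself, the choice of k = 3, the mapping of U to T to match a genome-derived vocabulary, and the hypothetical helper names kmer_tokenize and to_model_input.

```python
def kmer_tokenize(sequence: str, k: int = 3) -> list[str]:
    """Split a nucleotide sequence into overlapping k-mers (stride 1).

    Assumption: the k-mer scheme and k=3 are illustrative choices,
    not details taken from the abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Assumption: RNA input is mapped to the DNA alphabet (U -> T) so that
    # tokens match a vocabulary learned from a reference genome.
    seq = sequence.upper().replace("U", "T")
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def to_model_input(sequence: str, k: int = 3) -> list[str]:
    """Wrap the k-mer tokens with BERT-style [CLS] and [SEP] markers."""
    return ["[CLS]", *kmer_tokenize(sequence, k), "[SEP]"]


if __name__ == "__main__":
    # Hypothetical 7-nucleotide RNA fragment.
    print(to_model_input("AUGGCUA"))
```

In a sequence classifier of this kind, the token list would typically be converted to vocabulary indices and the representation at the [CLS] position passed to a binary head (bound versus not bound); that step is omitted here because it depends on the specific pre-trained model.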
