Workshop Poster, ICML 2021 Workshop on Computational Biology

Representation learning of genomic sequence motifs via information maximization

Nicholas Lee

Abstract

Convolutional neural networks (CNNs) trained to predict regulatory functions from genomic sequence often learn partial or distributed representations of sequence motifs across many first-layer filters, which makes it difficult to interpret the biological relevance of these models’ learned features. Here we present Genomic Representations with Information Maximization (GRIM), an unsupervised learning method based on the Infomax principle that enables more comprehensive identification of the whole sequence motifs learned by CNNs. Through systematic experiments, we show empirically that GRIM can discover motifs in genomic sequences in settings where supervised learning struggles.
