

Poster
in
Workshop: Accessible and Efficient Foundation Models for Biological Discovery

Multi-Task Training Increases Native Sequence Recovery of Antigen-Specific T-cell Receptor Sequences

Dhuvarakesh Karthikeyan · Alex Rubinsteyn

Keywords: [ Computational Immunology ] [ large language models ] [ seq2seq ] [ Machine Translation ]


Abstract:

T-cells are a critical component of the adaptive immune system that use T-cell receptors (TCRs) to bind highly specific non-self peptide fragments presented by major histocompatibility complex (MHC) molecules on the surface of other cells. Given their importance, a foundation model of TCR specificity that is capable of reliably mapping between TCR sequences and their cognate peptide-MHC (pMHC) ligands remains an unmet need. This study presents a key step towards developing a comprehensive foundation model by exploring the bi-directional mapping of both pMHCs to their corresponding TCRs, and vice versa. While validation performance was significantly worse in the TCR to pMHC direction given the highly asymmetric distribution of pMHC data, we find that the bidirectionally trained model outperformed the model trained in a single pMHC to TCR direction. We present our findings as a potential direction towards a unified generative foundation model of TCR:pMHC cross-reactivity.
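The bidirectional multi-task setup described above can be sketched as a data-construction step: each paired (pMHC, TCR) example is expanded into two directed seq2seq examples, with a direction token telling the model which translation task it is performing. This is a minimal illustrative sketch; the direction tokens, sequence formats, and pairing scheme are assumptions for illustration and are not taken from the paper.

```python
# Hedged sketch of bidirectional multi-task pair construction for a
# seq2seq model. Tokens "[P2T]" / "[T2P]" are hypothetical markers
# indicating the translation direction (pMHC->TCR vs. TCR->pMHC).

def make_bidirectional_pairs(pmhc_tcr_pairs):
    """Expand each (pMHC, TCR) pair into two directed training examples."""
    examples = []
    for pmhc, tcr in pmhc_tcr_pairs:
        # Forward task: generate a TCR sequence from a pMHC ligand.
        examples.append(("[P2T] " + pmhc, tcr))
        # Reverse task: generate the pMHC ligand from a TCR sequence,
        # giving the multi-task (bidirectional) training signal.
        examples.append(("[T2P] " + tcr, pmhc))
    return examples

# Example with a hypothetical peptide|allele input and a CDR3 sequence.
pairs = [("GILGFVFTL|HLA-A*02:01", "CASSIRSSYEQYF")]
for src, tgt in make_bidirectional_pairs(pairs):
    print(src, "->", tgt)
```

A model trained on the expanded set sees both directions of the same underlying pairing, which is one common way to frame bidirectional training as multi-task learning over a shared sequence-to-sequence backbone.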
