CD4+ T cells orchestrate the adaptive immune response against pathogens and cancer by recognizing epitopes presented on class II major histocompatibility complex (MHC-II) molecules. The high polymorphism of MHC-II genes represents an important hurdle toward accurate prediction and identification of CD4+ T cell epitopes. Here we collected and curated a dataset of 627,013 unique MHC-II ligands identified by mass spectrometry. This enabled us to precisely determine the binding motifs of 88 MHC-II alleles across humans, mice, cattle, and chickens. Analysis of these binding specificities combined with X-ray crystallography refined our understanding of the molecular determinants of MHC-II motifs and revealed a widespread reverse-binding mode in HLA-DP ligands. We then developed a machine-learning framework to accurately predict binding specificities and ligands of any MHC-II allele. This tool improves and expands predictions of CD4+ T cell epitopes and enables us to discover viral and bacterial epitopes following the aforementioned reverse-binding mode.