Identifying the genes underlying human diseases is an important step in understanding and treating genetic disorders. Based on the assumption that related diseases are caused by related genes, several methods for candidate gene prioritization have been proposed to refine lists of suspect genes obtained by linkage analysis or other methods. The large increase in publicly available -omics data has made it possible to implement prioritization methods that combine information from multiple data sources to produce better rankings. In this work, we present a new method for prioritizing candidate disease genes based on gene expression data that ranks 12851 genes for 5080 phenotypes. Its performance is comparable to that of previous methods, which used hand-curated protein-protein data on smaller test sets. We also propose a method for combining multiple gene networks into a single network, with which we ranked up to 14612 genes for 5080 phenotypes, more than any previous method. Our evaluation shows that the performance of the fused network is superior to that of its separate component networks.
Didier Trono, Evaristo Jose Planet Letschert, Julien Léonard Duc, Alexandre Coudray, Julien Paul André Pontis, Delphine Yvette L Grun, Cyril David Son-Tuyên Pulver, Shaoline Sheppard