Protein-protein interaction (PPI) network alignment is a canonical operation for transferring biological knowledge among species. The alignment of PPI networks has many applications, such as the prediction of protein function, the detection of conserved network motifs, and the reconstruction of species' phylogenetic relationships. A good multiple-network alignment (MNA), by considering the data related to several species, provides a deep understanding of biological networks and system-level cellular processes. With the massive amounts of available PPI data and the increasing number of known PPI networks, the problem of MNA is gaining more attention in systems-biology studies. In this paper, we introduce a new scalable and accurate algorithm, called MPGM, for aligning multiple networks. The MPGM algorithm has two main steps: (i) SeedGeneration and (ii) MultiplePercolation. In the first step, to generate an initial set of seed tuples, the SeedGeneration algorithm uses only protein sequence similarities. In the second step, to align the remaining unmatched nodes, the MultiplePercolation algorithm uses the network structures and the seed tuples generated in the first step. We show that, with respect to different evaluation criteria, MPGM outperforms other state-of-the-art algorithms. In addition, we guarantee the performance of MPGM under certain classes of network models. We introduce a sampling-based stochastic model for generating k correlated networks. We prove that, for this model, if a sufficient number of seed tuples are available, the MultiplePercolation algorithm correctly aligns almost all the nodes. Our theoretical results are supported by experimental evaluations over synthetic networks.
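To make the seed-then-percolate idea concrete, the following is a minimal sketch of percolation-style matching between two networks only. It is not the MPGM implementation: the abstract does not give the exact MultiplePercolation rule, which operates on tuples across k networks. The function names `build_adjacency` and `percolate_pairs`, the threshold `r`, and the toy graph are all assumptions made for illustration, and the seeds are supplied by hand rather than derived from sequence similarity.

```python
from collections import defaultdict
from itertools import product


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def percolate_pairs(adj_a, adj_b, seeds, r=2):
    """Simplified pairwise percolation matching (illustrative assumption, not MPGM).

    Every matched pair (a, b) gives one mark to each candidate pair formed by
    an unmatched neighbour of a and an unmatched neighbour of b. A candidate
    pair is matched once it has collected at least r marks, and then spreads
    marks in turn.
    """
    a_to_b, b_to_a = {}, {}
    marks = defaultdict(int)
    pending = []

    # Start from the seed pairs, skipping any seed that conflicts with an earlier one.
    for a, b in seeds:
        if a not in a_to_b and b not in b_to_a:
            a_to_b[a], b_to_a[b] = b, a
            pending.append((a, b))

    while pending:
        a, b = pending.pop()
        for na, nb in product(adj_a[a], adj_b[b]):
            if na in a_to_b or nb in b_to_a:
                continue  # one of the two nodes is already aligned
            marks[(na, nb)] += 1
            if marks[(na, nb)] >= r:
                a_to_b[na], b_to_a[nb] = nb, na
                pending.append((na, nb))

    return a_to_b


# Toy example: two identical five-node networks and two hand-picked seeds.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
network_a = build_adjacency(edges)
network_b = build_adjacency(edges)
alignment = percolate_pairs(network_a, network_b, seeds=[(0, 0), (1, 1)], r=2)
```

In this toy case the two seeds are enough for the matching to spread to all five nodes. A higher `r` makes each new match require more supporting evidence, at the cost of needing more seeds before percolation takes off.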
Anne-Florence Raphaëlle Bitbol, Nicola Dietler, Umberto Lupo