Reconstruction of Large-Scale Phylogenies

Vaibhav Rajan
2012
EPFL thesis

Abstract

The evolutionary relationships between organisms or phylogenies are fundamental to biology. They are invaluable as guiding tools to mine, organize and exploit the enormous amounts of biological data in the post-genomic era. The advent of high-throughput sequencing has resulted in whole genome data of many organisms and the need for inferring larger phylogenies ; both have necessitated the development of new methods and models in the field of phylogenetic reconstruction. The work presented in this dissertation contributes to this development. Phylogenies are most often inferred from genomic sequences – DNA or amino acid sequences. A key step before using a phylogenetic reconstruction method is that of aligning the input sequences. Finding accurate multiple sequence alignments is hard because of the heterogeneity of evolutionary signal present in the sequences—a more acute problem in whole genome sequences. A new approach to refining the phylogenetic signal of alignments is presented that identifies and eliminates phylogenetically noisy parts of the alignment in order to yield better phylogenies. The more recent model of evolution based on rearrangements (large-scale structural changes in genomes) has been a major step towards reconstructing large phylogenies. The design of models and methods in this area has presented significant mathematical challenges and some of these algorithmic and statistical questions are addressed in this work. The efficacy of simple distance-based methods are demonstrated in building accurate trees with rearrangement data, using precise estimates of true evolutionary distances. Novel methods methods are designed for robustness assessment of trees inferred by such distance-based methods. These methods are the first methods for robustness assessment for trees inferred from rearrangement data that are on par with previous such measures for trees inferred from sequence data. Further, two algorithmic problems on inversions are discussed : sorting by inversions and the inversion median problem. New algorithms and theoretical insights about the structure of the problems are described.

Official source

https://infoscience.epfl.ch/record/176958?ln=en

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Reconstruction of Large-Scale Phylogenies

Graph Chatbot

Chat with Graph Search

The microbial genomics of glacier-fed streams: adaptations to an extreme ecosystem

From trees to barcodes and back again II: Combinatorial and probabilistic aspects of a topological inverse problem

Impact of phylogeny on structural contact inference from protein sequence data

The microbial genomics of glacier-fed streams: adaptations to an extreme ecosystem

From trees to barcodes and back again II: Combinatorial and probabilistic aspects of a topological inverse problem

Impact of phylogeny on structural contact inference from protein sequence data