Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
With the advent of next-generation sequencing technologies, large data sets of several thousand loci from multiple conspecific individuals are available. Such data sets should make it possible to obtain accurate estimates of population genetic parameters, even for complex models of population history. In the analyses of large data sets, it is difficult to consider finite-sites mutation models (FSMs). Here, we use extensive simulations to demonstrate that the inclusion of FSMs is necessary to avoid severe biases in the estimation of the population mutation rate , population divergence times, and migration rates. We present a new version of Jaatha, an efficient composite-likelihood method for estimating demographic parameters from population genetic data and evaluate the usefulness of Jaatha in two biological examples. For the first application, we infer the speciation process of two wild tomato species, Solanum chilense and Solanum peruvianum. In our second application example, we demonstrate that Jaatha is readily applicable to NGS data by analyzing genome-wide data from two southern European populations of Arabidopsis thaliana. Jaatha is now freely available as an R package from the Comprehensive R Archive Network (CRAN).
Nicolas Jean Philippe Guex, Eléonore Lavanchy