Evolution can be described as the change of allele frequencies over time. Four forces (mutation, migration, genetic drift, and selection) drive this change. The aim of my thesis was to accurately estimate and differentiate the parameters governing each of these four mechanisms using various types of Next-Generation Sequencing datasets. More specifically, in chapters 1 and 2, I investigated how the past demographic history of African and European D. melanogaster affected its genomic polymorphism. Modern fly genomes carry signatures of past events such as migration to new regions, adaptation to new environments, and population size changes. By studying whole-genome sequences of 29 wild strains from West Africa and 14 from Sweden, and comparing them with genomes from Zambia (the putative ancestral range of the species), we were able to report for the first time a colonization time for the western part of the African continent, at approximately 72,000 years ago. Additionally, we demonstrated the importance of gene flow between the two populations, as well as of current and past effective population sizes. Our estimates confirmed previously published values (current Zambian and Swedish population sizes, ancestral African population size). Finally, we demonstrated the importance of accounting for inversions when inferring the demographic history of D. melanogaster.
In chapter 3 of my thesis, I evaluated the importance of selection acting on the DNA-binding residues of the largest family of transcription factors in primates, namely the KRAB-ZF genes. We were able to demonstrate the existence of two distinct sub-groups, based on the type of polymorphism (synonymous or non-synonymous) carried by the DNA-contacting nucleotides. The two groups of genes differ in their expression breadth and intensity, in their numbers of paralogs and orthologs, and in their evolutionary age. Additionally, we manually annotated the complete catalog of human KRAB-ZF genes, thereby providing a valuable resource for further investigation of this gene family.
In conclusion, the work carried out during my thesis refined our understanding of the evolution and demography of African and Northern European D. melanogaster populations, underlining the importance of modelling migration between populations for accurate estimation of split times. The second component of my thesis demonstrated the applicability of transcriptomic and epigenomic datasets to the study of the evolution of the KRAB-ZF family. The proposed methodologies are applicable to other transcription factor gene families, and our manually curated dataset is relevant to other scientists deciphering the function of these genes.