Our genome is a long sequence of DNA that contains all the information needed to build a living organism like us, much as the letters in a book combine to tell a story. This sequence, a chain of molecules called nucleotides, is almost identical between organisms of the same species (>99.9% in humans), but it differs at a small fraction of positions where individual nucleotides vary. This genetic variation influences our traits and affects our susceptibility to disease. The study of genetic variants is therefore highly relevant in genetics and biomedicine, since it makes it possible not only to predict predisposition to a disease, such as cancer, but also to understand the molecular and cellular mechanisms behind a given phenotype. The effect of a genetic variant on a phenotype stems from its capacity to modulate a gene, either by directly altering the gene's sequence and thus changing the protein, or by regulating the gene's expression level. Interestingly, more than 88% of the disease-associated genetic variants in humans are located outside genes, in the so-called non-coding part of the genome. This makes it challenging to identify which gene a given variant affects and to understand how it does so. Genetic variants influence not only gene expression but also the epigenetic state of the DNA (e.g. DNA methylation, histone marks, transcription factor (TF) binding) and its 3D conformation, thereby modifying the activity of gene regulatory elements such as promoters and enhancers. Previous work has shown that some regulatory regions located in the same locus exhibit a high level of molecular coordination and interindividual variation, mostly driven by genetic variation or environmental effects.
These regions were termed variable chromatin modules (VCMs) or cis-regulatory domains (CRDs) and are thought to be the manifestation of fine-grained regulatory units of the genome. How a VCM forms, and what underlies the cooperativity between its regulatory elements at the molecular level, remain outstanding questions in the field. In this work, we aim to resolve part of this puzzle by mechanistically dissecting one VCM that is influenced by a non-coding germline variant in B cells. Using genetic engineering, TF binding assays and chromosome conformation studies, we demonstrate that this variant creates a de novo binding site for the TF MEF2, which acts as a pioneer factor for dozens of other collaborating TFs. In turn, this activates a cluster of enhancers that spans more than 150 kilobases (kb), regulates AXIN2 gene expression, and is associated with a global compaction of the chromatin. In addition, in line with the known tumor-suppressor properties of AXIN2, we found that this variant influences chronic lymphocytic leukemia (CLL) progression and predisposition. Together, the characterization of the AXIN2 VCM provides the community with a unique example of how a germline variant can switch on an entire genomic locus with a high degree of cooperativity between regulatory elements, since it captures both the formation and the transition behavior of a VCM. Furthermore, given the link with CLL, this focal variant could have translational potential by serving as a patient-stratification or treatment diagnostic.