There are 377 Krüppel-associated box (KRAB) domain-containing zinc finger proteins (KZFPs) in the human genome, making them the largest family of transcription factors. KZFPs are defined by an N-terminal KRAB domain and several zinc-finger domains arranged in an array at the C-terminus of the protein. Each zinc-finger domain forms sequence-specific interactions with double-stranded DNA, allowing the zinc-finger array to target specific genomic sequences. The KRAB domain, through its interaction with the protein TRIM28 (also known as KAP1), allows for the stable silencing of transcription in a genomic region. Together, these two domains allow KZFPs to bind specific regions of the genome, which generally leads to the formation of heterochromatin and the silencing of transcription, although other functions have recently been reported for some KZFPs. KZFPs usually target transposable elements (TEs), mobile genomic elements that can move through the genome by either cut-and-paste or copy-and-paste mechanisms and that make up almost half of the human genome. Different KZFPs bind to specific regions of TEs, limiting their expression and thus mitigating some of the threat they pose to human development, which allows both the TE and its host to survive. In recent years it has become apparent that certain TEs and KZFPs have roles beyond this threat and its mitigation. TEs harbor gene regulatory regions, and their spread disseminates those regions through the genome, allowing new gene regulatory networks to form. KZFPs can also affect the transcription of genes located near their binding sites, leading to further changes in gene regulation. A multitude of regulatory roles involving KZFPs and TEs have been described in recent years, making them an exciting field of study.
A major hurdle in understanding both KZFPs and TEs is identifying the binding sites of KZFPs, as these reveal both the targets of each KZFP and how, if at all, a TE is regulated by KZFPs. To do so, we expanded on previous efforts and aimed to perform chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) on every member of the human KZFP family. Here we present ChIP-seq experiments for 110 new KZFPs, which, together with previously published data, allow us to study the binding of almost all human KZFPs (95%). The entire dataset was analysed together to generate a coherent resource, and the results for each KZFP are publicly available on our web portal (https://tronoapps.epfl.ch/web/krabopedia/). The identified binding sites allowed us to witness the adaptation of the KZFP family to new TEs and showed how targeted sequences shift after segmental gene duplication events. Furthermore, we could corroborate that several KZFPs target the same TE subfamilies in a seemingly redundant fashion and show that, unexpectedly, these KZFPs arose independently in different genomic locations. These results represent a valuable tool for anyone studying KZFPs and should aid in the pursuit of understanding both KZFPs and TEs.
Didier Trono, Evaristo Jose Planet Letschert, Wayo Matsushima