The protein sequence space is massive: a protein of 100 residues has 20^100 possible sequences. Because this number exceeds the number of atoms in the universe, the chance of randomly discovering a stable new sequence with the desired characteristics is infinitesimally small. Computational methodologies that can search the sequence space and expand beyond naturally occurring functional protein sequence variants therefore hold enormous potential in biomedicine and nanotechnology.
My thesis work leverages machine learning, physics-based, and data-driven techniques to design new protein molecules with distinct shapes (folds) so that they can precisely interact with other molecules to perform biological functions.
The first part of my thesis is dedicated to the design of functional proteins. The redesign of an anti-CRISPR (Acr) protein that can be controlled via blue light (optogenetic control) to regulate the genome-editing activity of the enzyme CRISPR-Cas9 is presented. A surface-based design of a broad-spectrum inhibitory Acr towards another natural target (SauCas9) exemplifies the repurposing of existing inhibitory molecules against other related targets. This ultimately led to the development of a general surface-centric design method for generating specific protein-protein interactions from scratch, exemplified by the successful design of novel inhibitors of PD-L1, an immune checkpoint that can halt the immune system from attacking cancer cells.
The successful design of protein-protein interactions relies heavily on the underlying protein fold and on the structure stabilizing the functional motif in a protein. Because nature has evolved only a small set of protein folds, generated protein-interaction motifs can rarely be incorporated into existing protein structures.
To address this problem, the second part of my thesis is dedicated to the development of computational de novo protein design methods for crafting proteins with customized folds. To this end, the TopoBuilder framework uses a large collection of native proteins to transform a literal description of a protein fold into a physically realistic protein. Finally, Genesis, a deep neural network approach for tailored de novo protein design, is presented. With both TopoBuilder and Genesis, proteins completely absent from the natural repertoire were designed and experimentally validated.
My thesis sets the path to explore possibilities of jointly optimizing a protein's shape and its surface geometry to master biological functions. We are now entering a new era in which newly designed protein-based drugs and materials have the potential to solve a vast array of technical challenges and open new avenues for next-generation precision drugs and advanced nanomaterials.
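The sequence-space estimate at the start of the abstract can be checked with a short calculation. The sketch below is illustrative only and is not part of the thesis: the function name `sequence_space_size` is hypothetical, and the figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate for the observable universe, assumed here for comparison.

```python
# Assumption: commonly quoted order-of-magnitude estimate of the number
# of atoms in the observable universe; used only for comparison.
ATOMS_IN_UNIVERSE_ESTIMATE = 10**80

NUM_AMINO_ACIDS = 20  # the 20 standard amino acids


def sequence_space_size(length: int) -> int:
    """Return the exact number of distinct sequences of the given length."""
    return NUM_AMINO_ACIDS ** length


# Python integers have arbitrary precision, so the count is exact.
size = sequence_space_size(100)
num_digits = len(str(size))

print(f"20^100 has {num_digits} decimal digits")
print("exceeds the atom estimate:", size > ATOMS_IN_UNIVERSE_ESTIMATE)
```

Since 20^100 = 2^100 × 10^100, the count is on the order of 10^130, which is why exhaustive or random search over sequences is not feasible even for a small protein.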