Proteins control nearly every facet of life at the molecular level. Proteins are formed from linear strings of amino acids, which fold into three-dimensional structures that carry out functions. Evolution has created highly efficient proteins with diverse folding patterns and a variety of functions. Despite this diversity, nature has sampled only a limited portion of the potential sequence/fold space, leaving a vast space that could contain novel folds or potential functions. The use of proteins in biotechnological and therapeutic fields has advanced in the past few decades, and with it the desire to design novel proteins with a given structure or function. Modern de novo design methods strive to design new proteins that fold into novel structures and/or have novel functions, be it enzymatic activity or binding to a target. Despite rapid advancements, many challenges remain in the field. Previous efforts to design diverse backbones that fold into specific structures have faced difficulties. Designing novel binding partners for a specific target with no known binders can be challenging, and many methods suffer from extremely low success rates and low-affinity outcomes. The projects described in this thesis seek to address these limitations and expand the ability of protein engineers to design proteins.
We developed Genesis, a program that predicts a protein sequence from a desired structure. We used Genesis in conjunction with trRosetta to design sequences that fold into native folds (those existing in nature) and tested these using a high-throughput assay. We then designed darkfolds: novel folds that have not been seen in nature and that could be given novel functions. We tested these folds for stability against proteolytic cleavage and produced the most stable designs to test biochemically. We were able to produce several darkfold proteins and found them to be folded and stable. Additionally, analysis of the high-throughput assay allowed us to draw conclusions about the design process.
Beyond structure, proteins are endowed with function. These functions require interaction with DNA, lipids, small molecules, and, importantly, other proteins. Protein-protein interactions (PPIs) can be transient, low-affinity interactions or long-lasting, high-affinity partnerships. PPIs form when two complementary surfaces, generally hydrophobic patches, come together and exclude water molecules from the buried interface. Dysregulation of certain PPIs perturbs cellular homeostasis and can lead to disease. Conversely, controlling specific PPIs can be useful for therapeutics and in cellular assays. The MaSIF surface fingerprinting software uses deep learning to find surface patches complementary to a given target site of interest. In this thesis, MaSIF is used to design protein binders for therapeutically relevant targets. The binders were then optimized and validated experimentally. This project yielded four site-specific protein binders: one for the SARS-CoV-2 spike protein, which was found to be neutralizing in cellular assays, two for PD-L1, and one for PD-1. Structures of the designs bound to their targets were solved for three of the four binders, highlighting the success of the method.
This thesis addresses the challenges of designing de novo proteins with novel folds and of designing protein binding partners for specific sites. These projects give protein engineers additional capability to create functional novel proteins.
Bruno Emanuel Ferreira De Sousa Correia, Casper Alexander Goverde