New drugs are needed to ensure effective therapies for previously untreated diseases, emerging diseases, and personalized medicine, but drug development is complex, costly, and time-consuming. This is especially problematic considering that 90% of drug candidates in clinical trials are discarded due to unexpected toxicity or other secondary effects. This inefficiency threatens our health care system and economy. Despite advances in the study of cellular metabolism, our knowledge of the mechanisms governing enzymatic biotransformations in cells is far from complete, in particular regarding the degradation pathways, modes of action, and side effects of drugs. Examining the mechanisms of enzymatic reactions at the cellular scale could improve our fundamental understanding of their catalytic capability and help identify and fill the knowledge gaps. The scale and complexity of metabolic data are ever-expanding, requiring scientists to apply more advanced computational methods to systematically store, explore, and interpret the enzymatic potential of cells. The first step toward simulating enzymes in silico is to learn from their biochemical reactions in nature. To do this, we use distilled knowledge of known biochemistry in the form of generalized enzymatic reaction rules. Enzymatic rules are mathematical representations of enzymatic action that mimic the catalytic function of enzymes. They are formulated in a less specific (more promiscuous) manner so that they act on a broad range of substrates. In addition to reconstructing known biochemistry, applying these reaction rules paves the way toward the discovery of novel enzymatic interactions. In this thesis, I developed computational models, tools, and methodologies to facilitate the study of metabolism and the catalytic action of enzymes.
We analyzed different aspects of metabolism through five distinct studies. In a first study, to provide a holistic view of currently known biochemistry, we gathered biochemical data from 14 sources, covering the known metabolic networks of all species. We integrated all biological data into a high-performance, ontology-based database named LCSB DB. We further expanded the scope of LCSB DB to cover all bioactive compounds and chemicals. LCSB DB offers fast and efficient searching of biochemical data and serves as a platform for sharing, storing, and analyzing biochemical data. In a second study, we used enzymatic reaction rules to predict all theoretically possible metabolic reactions between biological and bioactive compounds in LCSB DB. In a third study, we developed a method, called BridgIT, to find enzymes that are able to catalyze orphan and predicted reactions. BridgIT uses the knowledge of reactive sites on substrates to find the most similar known biochemical reactions. We then validated the utility of BridgIT in enzyme discovery for the design of de novo synthetic pathways producing tetrahydropalmatine and adipic acid. In the last study, we propose a workflow for rational drug design and systems-level analysis of drug metabolism, called NICEdrug.ch. NICEdrug.ch allows large-scale computational analysis of drug biochemistry (metabolic precursors or prodrugs, and metabolic fate or degradation), enzymatic targets, and toxicity in the context of cellular metabolism. Finally, in the conclusion chapter, we discuss the contribution and potential further applications of the computational tools developed in this thesis.
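As an illustration of the similarity search attributed to BridgIT above, the sketch below ranks known reactions against an orphan reaction by comparing sets of reactive-site features. It is not the BridgIT implementation: the Tanimoto (Jaccard) score, the feature labels, the reaction identifiers, and the function names are all assumptions made for this example, since the abstract does not specify how reactions are encoded or scored.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def rank_known_reactions(query: frozenset, known: dict) -> list:
    """Return (reaction_id, score) pairs, most similar first."""
    scores = [(rid, tanimoto(query, feats)) for rid, feats in known.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets describing atoms, bonds, and cofactors
# around the reactive site of each known reaction (assumed labels).
known_reactions = {
    "R1": frozenset({"C-OH", "C=O", "NAD+"}),
    "R2": frozenset({"C-NH2", "C=O", "PLP"}),
    "R3": frozenset({"C-OH", "C=O", "NADP+"}),
}

# Hypothetical orphan reaction with no assigned enzyme.
orphan = frozenset({"C-OH", "C=O", "NAD+", "C-C"})

ranking = rank_known_reactions(orphan, known_reactions)
for reaction_id, score in ranking:
    print(f"{reaction_id}: {score:.2f}")
```

In this toy setting, the enzymes annotated for the top-ranked known reaction would be the first candidates to consider for the orphan reaction. A real workflow would derive the features from molecular structures rather than hand-written labels.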
Philippe Schwaller, Oliver Tobias Schilter, Andres Camilo Marulanda Bran, Carlo Baldassari