Although design principles for optimizing codon choice in heterologous expression vectors have been established, the relationship between codon sequence and final protein yield remains poorly understood. In this work, we present a computational framework, built on a codon-sequence mechanistic model of protein synthesis, for identifying a set of mutant codon sequences that optimize heterologous protein production. Through a sensitivity analysis of the optimal steady-state configuration of protein synthesis, we identify the codons that are most rate-limiting with respect to the steady-state protein synthesis rate, and we replace them with synonymous codons recognized by charged tRNAs that are more efficient for translation, so that the resulting codon elongation rate is higher. Repeating this procedure, we iteratively optimize the codon sequence for a higher protein synthesis rate while taking into account multiple constraints of various types. We determine a small set of optimized synonymous codon sequences that are very close to each other in sequence space but that nonetheless affect properties such as ribosomal utilization or secondary structure. This limited number of sequences can then be offered for further experimental study. Overall, the proposed method helps in understanding how the different properties of mRNA sequences affect the final protein yield in heterologous protein production, and it can find applications in synthetic biology and biotechnology.
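The iterative procedure described above (rank codons by sensitivity, swap the most rate-limiting one for a faster synonymous codon, repeat under a constraint) can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the elongation and initiation rates are invented numbers, the synthesis-rate proxy is a simple sum of waiting times rather than the mechanistic model used in the work, and the only constraint is a hypothetical cap on the number of mutations (`max_mutations`).

```python
# Toy sketch only. The rates and the synthesis-rate proxy below are
# illustrative assumptions, not the mechanistic model from the abstract.

# Hypothetical elongation rates (codons per second) for a few codons.
ELONGATION_RATE = {
    "CTG": 12.0, "CTA": 2.0, "TTA": 3.0,  # leucine
    "AAA": 10.0, "AAG": 6.0,              # lysine
    "GAA": 11.0, "GAG": 5.0,              # glutamate
}
AMINO_ACID = {
    "CTG": "L", "CTA": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
}
INITIATION_RATE = 0.5  # hypothetical initiation rate (per second)


def synthesis_rate(codons):
    """Proxy for the steady-state rate: inverse of the total waiting time."""
    total_time = 1.0 / INITIATION_RATE
    total_time += sum(1.0 / ELONGATION_RATE[c] for c in codons)
    return 1.0 / total_time


def sensitivities(codons):
    """Derivative of the proxy rate J with respect to each codon's rate k_i.

    For J = 1 / (1/k_init + sum_i 1/k_i), dJ/dk_i = J**2 / k_i**2.
    """
    j = synthesis_rate(codons)
    return [j * j / ELONGATION_RATE[c] ** 2 for c in codons]


def best_synonym(codon):
    """Fastest codon that encodes the same amino acid."""
    synonyms = [c for c in AMINO_ACID if AMINO_ACID[c] == AMINO_ACID[codon]]
    return max(synonyms, key=ELONGATION_RATE.get)


def optimize(codons, max_mutations):
    """Greedily replace the most rate-limiting codon, one swap per iteration."""
    codons = list(codons)
    for _ in range(max_mutations):
        sens = sensitivities(codons)
        # Only positions where a faster synonymous codon exists can be improved.
        candidates = [i for i, c in enumerate(codons) if best_synonym(c) != c]
        if not candidates:
            break
        target = max(candidates, key=lambda i: sens[i])
        codons[target] = best_synonym(codons[target])
    return codons


if __name__ == "__main__":
    original = ["CTA", "AAG", "GAG", "TTA"]
    optimized = optimize(original, max_mutations=2)
    print(original, synthesis_rate(original))
    print(optimized, synthesis_rate(optimized))
```

In this proxy the sensitivity is largest for the slowest codon, so the greedy step always targets the current bottleneck. A model that accounts for ribosome traffic or mRNA secondary structure would rank codons differently and would need additional constraints.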