Mass spectrometry (MS) has emerged over the last two decades as the analytical technique of choice in systems-level protein studies, known as proteomics. Two MS-based approaches are generally applied to proteomics: bottom-up (BU), which relies on the proteolytic digestion of proteins into short (~10 amino acids) peptides, and top-down (TD), where proteolysis is omitted and intact proteins are detected and fragmented in the gas phase. Both methods have advantages as well as drawbacks. Here, we sought to establish a complete platform to put forward a third MS-based proteomic approach: middle-down (MD). It involves protein digestion as in BU, but aims to generate large peptides whose size approaches that of the small intact proteins that are readily analyzed in TD. This novel domain aims to address the shortcomings of both classical approaches. Until now, the main reasons behind the limited use of MD proteomics have been the lack of easy-to-use cleaving agents capable of producing peptides in the desired 3-15 kDa mass range with high specificity, limitations in MS and allied instrumentation, and the absence of dedicated bioinformatics tools for processing the acquired data. The latter greatly impedes the next milestone in MD proteomics: large-scale analysis. MD can potentially combine the analysis of large portions of proteins carrying sets of biologically relevant modifications (allowing the exploration of proteins whose molecular weight or complexity is incompatible with current TD capabilities) with the high-throughput hallmark of BU proteomics. Here, we first evaluated in silico the potential target amino acid residues for producing peptides in the MD mass range within the proteomes of different organisms. This bioinformatics work was followed by an experimental study based on synthetic MD-sized peptides, aimed at determining the optimal MS and tandem MS parameters for large peptide characterization.
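To make the in silico evaluation concrete, the following is a minimal Python sketch of how a candidate cleavage residue could be scored against the 3-15 kDa MD window. It is an illustration under stated assumptions, not the pipeline used in this work: the function names (`digest`, `peptide_mass`, `md_coverage`), the use of average residue masses, and the choice of sequence coverage as the score are all hypothetical, and missed cleavages and modifications are ignored.

```python
# Average residue masses in Da (standard values for unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once per peptide for the free N- and C-termini


def digest(sequence, residue, side="C"):
    """Cleave at every occurrence of `residue`, assuming complete cleavage.

    side="C" cuts after the residue, side="N" cuts before it.
    """
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        cut = i + 1 if side == "C" else i
        # Skip cuts that would produce an empty peptide at either end.
        if aa == residue and start < cut < len(sequence):
            peptides.append(sequence[start:cut])
            start = cut
    peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Average mass of an unmodified peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def md_coverage(sequence, residue, low=3000.0, high=15000.0, side="C"):
    """Fraction of the sequence covered by peptides inside the mass window."""
    covered = sum(
        len(p) for p in digest(sequence, residue, side)
        if low <= peptide_mass(p) <= high
    )
    return covered / len(sequence)


if __name__ == "__main__":
    # Toy sequence (hypothetical): one lysine between two poly-alanine stretches.
    toy = "A" * 50 + "K" + "A" * 50
    for residue in ("K", "A"):
        print(residue, round(md_coverage(toy, residue), 2))
```

Averaging `md_coverage` over every protein in a proteome, for each candidate residue and cleavage side, would give one rough way to rank targets. A real evaluation would also need to account for cleavage specificity and incomplete digestion.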
Next, we pursued two distinct ways of performing MD proteolysis: i) the use of an enzyme and ii) the use of a chemical reagent. We selected the protease Sap9 as a target enzyme for MD, fully characterized it, and successfully applied it to the study of a mixture of monoclonal antibodies. There, it showed an advantage over traditional BU by introducing fewer artifacts into the sample, which enabled the investigation of post-translational modifications and unambiguous antibody identification. We addressed the chemical cleavage route through careful protocol optimization for hydrolysis at the N-terminal side of cysteine with the NTCB reagent. We also advanced MD protocols by generating large (~50 kDa) subunits of monoclonal antibodies using papain and a novel, more specific protease, GingisKHAN, combined with new MS signal processing and data analysis capabilities. The developed workflow improved the mapping of the connectivity of cysteines involved in inter- and intra-molecular disulfide bridges in antibodies. To summarize, we demonstrated that the MD approach to mass spectrometry and proteomics is a powerful, yet underdeveloped, complement to BU and TD. This work has benchmarked MD for targeted protein analysis. In the near future, as the field advances, we envision its growing use for large-scale analysis of complex proteomes.
Nako Nakatsuka, Xinyu Zhang, Haiying Hu
Alexandra Krina Van Hall-Beauvais