Threading (protein sequence)

In molecular biology, protein threading, also known as fold recognition, is a protein modeling method used for proteins that share a fold with proteins of known structure but have no homologous proteins of known structure. It differs from homology modeling in that threading is used for proteins whose homologs have no structures deposited in the Protein Data Bank (PDB), whereas homology modeling is used for proteins whose homologs do.

Threading works by using statistical knowledge of the relationship between the structures deposited in the PDB and the sequence of the protein one wishes to model. The prediction is made by "threading" (i.e. placing, aligning) each amino acid in the target sequence to a position in the template structure, and evaluating how well the target fits the template. After the best-fit template is selected, the structural model of the sequence is built based on the alignment with the chosen template.

Protein threading is based on two basic observations: that the number of different folds in nature is fairly small (approximately 1300); and that 90% of the new structures submitted to the PDB in the past three years have similar structural folds to ones already in the PDB.

The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the structural and evolutionary relationships of proteins of known structure. Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, superfamily, and fold:

Family (clear evolutionary relationship): Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absence of high sequence identity; for example, many globins form a family though some members have sequence identities of only 15%.
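The threading procedure described above (place each target residue at a template position, score the fit, keep the best-scoring template) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the templates are reduced to strings of burial labels ("B" buried, "E" exposed), the scoring function is a crude hydrophobicity rule rather than a statistical potential derived from the PDB, and the names `fit_score`, `thread`, and `best_template` are hypothetical. Real threading programs use far richer structural descriptions and scoring terms.

```python
# Toy fold-recognition sketch. Assumed, not taken from any real tool:
# templates are strings of burial labels, scores are a simple +1/-1 rule.
HYDROPHOBIC = set("AVILMFWC")  # illustrative choice of hydrophobic residues


def fit_score(residue, environment):
    """Return +1 if a hydrophobic residue is buried or a polar one is exposed, else -1."""
    buried = environment == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def thread(sequence, template_env, gap=-2):
    """Best global alignment score of a sequence onto template positions."""
    n, m = len(sequence), len(template_env)
    # dp[i][j]: best score for the first i residues on the first j positions
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap  # residues left unplaced
    for j in range(1, m + 1):
        dp[0][j] = j * gap  # template positions left empty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + fit_score(sequence[i - 1], template_env[j - 1]),
                dp[i - 1][j] + gap,  # skip a residue
                dp[i][j - 1] + gap,  # skip a template position
            )
    return dp[n][m]


def best_template(sequence, templates):
    """Thread the sequence onto every template and return the best name and all scores."""
    scores = {name: thread(sequence, env) for name, env in templates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical template library and target sequence.
templates = {"fold_A": "BBEEBBEE", "fold_B": "EEEEBBBB"}
target = "VLKDAIRS"

name, scores = best_template(target, templates)
print(name, scores)
```

In this example the target's hydrophobic pattern matches `fold_A` position by position, so that template is selected. A real pipeline would then build a structural model from the alignment to the chosen template, which this sketch does not attempt.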

Opportunities and challenges in design and optimization of protein function

Bruno Emanuel Ferreira De Sousa Correia, Casper Alexander Goverde

The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calcu ...

Nature Portfolio, 2024

Impact of phylogeny on structural contact inference from protein sequence data

Anne-Florence Raphaëlle Bitbol, Umberto Lupo, Nicola Dietler

Local and global inference methods have been developed to infer structural contacts from multiple sequence alignments of homologous proteins. They rely on correlations in amino acid usage at contacting sites. Because homologous proteins share a common ance ...

ROYAL SOC, 2023

Prolamins' 3D structure: A new insight into protein modeling using the language of numbers and shapes
