Improving the sensitivity of the sequence profile method

The sequence profile method (Gribskov M, McLachlan AD, Eisenberg D, 1987, Proc Natl Acad Sci USA 84:4355-4358) is a powerful tool for detecting distant relationships between amino acid sequences. A profile is a table of position-specific scores and gap penalties that provides a generalized description of a protein motif; it is derived from a multiple sequence alignment and can be used in place of an individual sequence for sequence alignments and database searches. We have found two ways to improve the sensitivity of sequence profiles. (1) Sequence weights: assigning an individual weight to each sequence avoids bias toward closely related sequences. The weights are assigned automatically, based on the distances between the sequences, using a published procedure (Sibbald PR, Argos P, 1990, J Mol Biol 216:813-818). (2) Amino acid substitution table: in addition to the alignment, constructing a profile requires an amino acid substitution table. We have found that in some cases a newer table, the BLOSUM45 table (Henikoff S, Henikoff JG, 1992, Proc Natl Acad Sci USA 89:10915-10919), is more sensitive than the original Dayhoff table or the modified Dayhoff table used in the current implementation. Profiles derived by the improved method are more sensitive and selective in a number of cases where previous methods failed to completely separate true members from false positives.
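To make the role of the two ingredients concrete, the following is a minimal sketch of how sequence weights and a substitution table might enter the position-specific scores. It is not the authors' implementation. It assumes that the score for residue a in a column is the weighted frequency of each observed residue b multiplied by the table entry for (a, b), summed over b. The function name `weighted_profile`, the three-letter alphabet, and the values in `TOY_TABLE` are hypothetical and chosen only for illustration (they are not BLOSUM45 or Dayhoff values). The weights are supplied by hand rather than computed by the Sibbald and Argos procedure, and gap penalties are omitted.

```python
from collections import defaultdict


def symmetric(table):
    """Expand a half table {(a, b): score} so that (b, a) is also present."""
    full = dict(table)
    for (a, b), score in table.items():
        full[(b, a)] = score
    return full


# Hypothetical toy substitution table (made-up values, three residues only).
TOY_TABLE = symmetric({
    ("A", "A"): 2, ("G", "G"): 3, ("L", "L"): 2,
    ("A", "G"): 0, ("A", "L"): -1, ("G", "L"): -2,
})


def weighted_profile(alignment, weights, table):
    """Return one {residue: score} dict per alignment column.

    Assumed scoring rule: score(a) = sum_b f(b) * table[(a, b)], where f(b)
    is the weighted fraction of sequences with residue b in that column.
    Gap characters ("-") are skipped; gap penalties are not modelled.
    """
    if len(alignment) != len(weights):
        raise ValueError("need exactly one weight per sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    alphabet = sorted({a for a, _ in table})
    profile = []
    for col in range(length):
        freq = defaultdict(float)
        total = 0.0
        for seq, w in zip(alignment, weights):
            residue = seq[col]
            if residue == "-":
                continue
            freq[residue] += w
            total += w
        scores = {}
        for a in alphabet:
            if total == 0.0:
                scores[a] = 0.0  # column contains only gaps
            else:
                scores[a] = sum(f / total * table[(a, b)]
                                for b, f in freq.items())
        profile.append(scores)
    return profile


# Two identical sequences and one divergent sequence.
alignment = ["AG", "AG", "LG"]
unweighted = weighted_profile(alignment, [1.0, 1.0, 1.0], TOY_TABLE)
reweighted = weighted_profile(alignment, [0.5, 0.5, 1.0], TOY_TABLE)
```

In this toy alignment, halving the weight of the two identical sequences lets the divergent sequence contribute as much to the first column as the redundant pair does, which is the kind of bias correction the weighting scheme is meant to provide. Swapping `TOY_TABLE` for a different table changes the scores without touching the alignment or the weights.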
