Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
3
pubmed:dateCreated
1993-9-22
pubmed:abstractText
The problem of protein structure prediction is formulated here as that of evaluating how well an amino acid sequence fits a hypothetical structure. The simplest and most complicated approaches, secondary structure prediction and all-atom free energy calculations, can be viewed as sequence-structure fitness problems. Here, an approach of intermediate complexity is described, which involves: (1) description of a protein structure in terms of contact interface vectors, with both intra-protein and protein-solvent contacts counted, (2) derivation of sequence preferences for 2 up to 29 contact interface types, (3) generation of numerous hypothetical model structures by placing the input sequence into a large set of known three-dimensional structures in all possible alignments, (4) evaluation of these models by summing the sequence preferences over all structural positions and (5) choice of predicted three-dimensional structure as that with the best sequence-structure fitness. Evolutionary information is incorporated by using position-dependent core weights derived from multiple sequence alignments. A number of tests of the method are performed: (1) evaluation of cyclic shifts of a sequence in its native structure; (2) alignment of a sequence in its native structure, allowing gaps; (3) alignment search with a sequence or sequence fragment in a database of structures; and (4) alignment search with a structure in a database of sequences.
The main results are: (1) a native sequence can very well find its native structure among a large number of alternatives, in correct alignment; (2) substructures, such as (beta alpha)n units, can be detected in spite of very low sequence similarity; (3) remote homologues can be detected, with some dependence on the set of parameters used; (4) contact interface parameters are clearly superior to classical secondary structure parameters; (5) a simple interface description in terms of just two states, protein-protein and protein-water contacts, performs surprisingly well; (6) the use of core weights considerably improves accuracy in detection of remote homologues; (7) based on a sequence database search with a myoglobin contact profile, the C-terminal domain of a viral origin of replication binding protein is predicted to have an all-helical fold. The sequence-structure fitness concept is sufficiently general to accommodate a large variety of protein structure prediction methods, including new models of intermediate complexity currently being developed.
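The scoring idea in the abstract (sum per-position sequence preferences over a structure, then compare alternative placements) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the two-state labels "buried" and "exposed" stand in for protein-protein and protein-water contacts, the preference values in PREFS are invented, and the names fitness and best_cyclic_shift are hypothetical. Only gapless cyclic shifts are covered, corresponding loosely to test (1).

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical two-state preference table: PREFS[state][residue].
# The numbers are made up for illustration; they are not from the paper.
PREFS: Dict[str, Dict[str, float]] = {
    "buried": {"L": 1.0, "V": 0.8, "K": -1.0, "D": -0.9},
    "exposed": {"L": -0.7, "V": -0.5, "K": 0.9, "D": 0.8},
}


def fitness(
    sequence: str,
    states: Sequence[str],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Sum the residue preferences over all structural positions.

    `weights` plays the role of position-dependent weights; if omitted,
    every position counts equally. Unknown residues score 0.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and structure must have the same length")
    if weights is None:
        weights = [1.0] * len(states)
    return sum(
        w * PREFS[state].get(residue, 0.0)
        for residue, state, w in zip(sequence, states, weights)
    )


def best_cyclic_shift(sequence: str, states: Sequence[str]) -> Tuple[int, float]:
    """Score every cyclic shift of the sequence and return (shift, score)."""
    scored = {
        k: fitness(sequence[k:] + sequence[:k], states)
        for k in range(len(sequence))
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


if __name__ == "__main__":
    toy_states = ["buried", "buried", "exposed", "exposed"]
    print(best_cyclic_shift("LVKD", toy_states))
```

In this toy case the unshifted sequence scores highest, because its hydrophobic residues fall on the buried positions. Choosing among many template structures would follow the same pattern: compute the fitness for each candidate placement and keep the maximum.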
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Aug
pubmed:issn
0022-2836
pubmed:author
pubmed:issnType
Print
pubmed:day
5
pubmed:volume
232
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
805-25
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
1993
pubmed:articleTitle
Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures.
pubmed:affiliation
Protein Design Group, EMBL, Heidelberg, Germany.
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't