Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
2
pubmed:dateCreated
2007-1-23
pubmed:abstractText
Structural genomics projects are determining the three-dimensional structure of proteins without full characterization of their function. A critical part of the annotation process involves appropriate knowledge representation and prediction of functionally important residue environments. We have developed a method to extract features from sequence, sequence alignments, three-dimensional structure, and structural environment conservation, and used support vector machines to annotate homologous and nonhomologous residue positions based on a specific training set of residue functions. In order to evaluate this pipeline for automated protein annotation, we applied it to the challenging problem of prediction of catalytic residues in enzymes. We also ranked the features based on their ability to discriminate catalytic from noncatalytic residues. When applying our method to a well-annotated set of protein structures, we found that top-ranked features were a measure of sequence conservation, a measure of structural conservation, a degree of uniqueness of a residue's structural environment, solvent accessibility, and residue hydrophobicity. We also found that features based on structural conservation were complementary to those based on sequence conservation and that they were capable of increasing predictor performance. Using a family nonredundant version of the ASTRAL 40 v1.65 data set, we estimated that the true catalytic residues were correctly predicted in 57.0% of the cases, with a precision of 18.5%. When testing on proteins containing novel folds not used in training, the best features were highly correlated with the training on families, thus validating the approach to nonhomologous catalytic residue prediction in general. We then applied the method to 2781 coordinate files from the structural genomics target pipeline and identified both highly ranked and highly clustered groups of predicted catalytic residues.
pubmed:commentsCorrections
http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-10618406, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-11327775, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-11478868, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-12270722, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-12421562, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-12662930, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-12782323, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-12850142, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-14681391, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-14960716, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15019783, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15036149, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15572779, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15696542, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15755451, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15797917, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-15980588, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-16245324, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-16424331, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-16790052, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-6667333, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-7613462, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-7723011, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-7749921, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-8377180, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-8609628, http://linkedlifedata.com/resource/pubmed/commentcorrection/17189479-9254694
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Feb
pubmed:issn
0961-8368
pubmed:author
pubmed:issnType
Print
pubmed:volume
16
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
216-26
pubmed:dateRevised
2009-11-18
pubmed:meshHeading
pubmed:year
2007
pubmed:articleTitle
Evaluation of features for catalytic residue prediction in novel folds.
pubmed:affiliation
Center for Computational Biology and Bioinformatics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't