rdf:type |
|
lifeskim:mentions |
|
pubmed:issue |
5
|
pubmed:dateCreated |
2010-2-25
|
pubmed:abstractText |
MOTIVATION: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites. RESULTS: In cross-validation experiments on two benchmark datasets from the Catalytic Site Atlas and CATRES resources containing a total of 437 manually curated enzymes spanning 487 SCOP families, Discern increases catalytic site recall between 12% and 20% over methods that combine information from both sequence and structure, and by ≥50% over methods that make use of sequence conservation signal only. Controlled experiments show that Discern's improvement in catalytic residue prediction is derived from the combination of three ingredients: the use of the INTREPID phylogenomic method to extract conservation information; the use of 3D structure data, including features computed for residues that are proximal in the structure; and a statistical regularization procedure to prevent overfitting.
|
pubmed:grant |
|
pubmed:commentsCorrections |
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-10815774,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-11292355,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-11478868,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-11575940,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-11588250,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-11606719,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-12181318,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-12421562,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-12475199,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-12662930,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-12850142,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-1438297,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-14630653,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-14681372,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-14681376,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-14681391,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-14980020,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15010543,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15033369,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15037084,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15201051,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15201400,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15318951,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15456910,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-1546324,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-1554694,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15701681,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-15980475,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-16003488,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-16037208,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-16790052,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-16995956,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-17189479,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-17519246,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-18096640,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-18174181,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-18654633,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-18776193,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-19558703,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-275827,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-332063,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-6667333,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-7613462,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-7661899,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-7723011,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-7749921,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-8609611,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-8609628,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-9015368,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-9188684,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-9254694,
http://linkedlifedata.com/resource/pubmed/commentcorrection/20080507-9719646
|
pubmed:language |
eng
|
pubmed:journal |
|
pubmed:citationSubset |
IM
|
pubmed:chemical |
|
pubmed:status |
MEDLINE
|
pubmed:month |
Mar
|
pubmed:issn |
1367-4811
|
pubmed:author |
|
pubmed:issnType |
Electronic
|
pubmed:day |
1
|
pubmed:volume |
26
|
pubmed:owner |
NLM
|
pubmed:authorsComplete |
Y
|
pubmed:pagination |
617-24
|
pubmed:dateRevised |
2010-9-28
|
pubmed:meshHeading |
pubmed-meshheading:20080507-Binding Sites,
pubmed-meshheading:20080507-Catalysis,
pubmed-meshheading:20080507-Catalytic Domain,
pubmed-meshheading:20080507-Databases, Protein,
pubmed-meshheading:20080507-Evolution, Molecular,
pubmed-meshheading:20080507-Models, Molecular,
pubmed-meshheading:20080507-Protein Conformation,
pubmed-meshheading:20080507-Protein Folding,
pubmed-meshheading:20080507-Proteins,
pubmed-meshheading:20080507-Proteomics,
pubmed-meshheading:20080507-Sequence Analysis, Protein
|
pubmed:year |
2010
|
pubmed:articleTitle |
Active site prediction using evolutionary and structural information.
|
pubmed:affiliation |
Computer Science Division, University of California, Berkeley, USA.
|
pubmed:publicationType |
Journal Article,
Research Support, U.S. Gov't, Non-P.H.S.,
Research Support, N.I.H., Extramural
|