Source: http://linkedlifedata.com/resource/pubmed/id/16718863
Predicate | Object |
---|---|
rdf:type | |
lifeskim:mentions | |
pubmed:issue | 1 |
pubmed:dateCreated | 2006-4-27 |
pubmed:abstractText | In this paper, we borrow the idea of the receiver operating characteristic (ROC) from clinical medicine and demonstrate its application to sequence comparison. The ROC includes elements of both sensitivity and specificity, and is a quantitative measure of the usefulness of a diagnostic. The ROC is used in this work to investigate the effects of scoring table and gap penalties on database searches. Studies on three families of proteins, 4Fe-4S ferredoxins, lysR bacterial regulatory proteins, and bacterial RNA polymerase sigma-factors lead to the following conclusions: sequence families are quite idiosyncratic, but the best PAM distance for database searches using the Smith-Waterman method is somewhat larger than predicted by theoretical methods, about 200 PAM. The length independent gap penalty (gap initiation penalty) is quite important, but shows a broad peak at values of about 20-24. The length dependent gap penalty (gap extension penalty) is almost irrelevant suggesting that successful database searches rely only to a limited degree on gapped alignments. Taken together, these observations lead to the conclusion that the optimal conditions for alignments and database searches are not, and should not be expected to be, the same. |
pubmed:grant | |
pubmed:language | eng |
pubmed:journal | |
pubmed:citationSubset | IM |
pubmed:chemical | http://linkedlifedata.com/resource/pubmed/chemical/Bacterial Proteins, http://linkedlifedata.com/resource/pubmed/chemical/Ferredoxins, http://linkedlifedata.com/resource/pubmed/chemical/LysR protein, Bacteria, http://linkedlifedata.com/resource/pubmed/chemical/Sigma Factor, http://linkedlifedata.com/resource/pubmed/chemical/Transcription Factors |
pubmed:status | MEDLINE |
pubmed:month | Mar |
pubmed:issn | 0097-8485 |
pubmed:author | |
pubmed:issnType | Print |
pubmed:volume | 20 |
pubmed:owner | NLM |
pubmed:authorsComplete | Y |
pubmed:pagination | 25-33 |
pubmed:dateRevised | 2007-11-14 |
pubmed:meshHeading | pubmed-meshheading:16718863-Bacterial Proteins, pubmed-meshheading:16718863-Ferredoxins, pubmed-meshheading:16718863-ROC Curve, pubmed-meshheading:16718863-Sequence Alignment, pubmed-meshheading:16718863-Sequence Analysis, pubmed-meshheading:16718863-Sigma Factor, pubmed-meshheading:16718863-Transcription Factors |
pubmed:year | 1996 |
pubmed:articleTitle | Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching. |
pubmed:affiliation | San Diego Supercomputer Center, P.O. Box 85608, San Diego, CA 92186-9784, USA. |
pubmed:publicationType | Journal Article, Research Support, U.S. Gov't, Non-P.H.S., Research Support, N.I.H., Extramural |
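
The abstract describes the ROC as combining sensitivity and specificity to judge how useful a database search is. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each search hit is a `(score, is_family_member)` pair with higher scores meaning better matches. The function names `roc_points` and `roc_area` and the toy scores are hypothetical, and the area under the curve is used here only as a common summary statistic, which may differ from the measure reported in the paper.

```python
from typing import List, Tuple


def roc_points(hits: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points for ranked hits.

    `hits` holds (score, is_family_member) pairs; a higher score is a better match.
    Hits with equal scores are added as one group, so their order does not matter.
    """
    positives = sum(1 for _, is_member in hits if is_member)
    negatives = len(hits) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one family member and one non-member")

    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Consume every hit that shares this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        # TPR is the sensitivity; FPR is 1 - specificity.
        points.append((fp / negatives, tp / positives))
    return points


def roc_area(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under the ROC curve (1.0 = perfect, 0.5 = random)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy search result (assumed data): three family members, two non-members.
    toy_hits = [(90, True), (80, True), (70, False), (60, True), (50, False)]
    print(roc_area(roc_points(toy_hits)))
```

Under these assumptions, comparing `roc_area` across searches run with different scoring tables or gap penalties would give one way to rank parameter choices for a given protein family.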