Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
12
pubmed:dateCreated
1998-07-29
pubmed:abstractText
Analysis of a newly sequenced bacterial genome starts with identification of protein-coding genes. Functional assignment of proteins requires the exact knowledge of protein N-termini. We present a new program ORPHEUS that identifies candidate genes and accurately predicts gene starts. The analysis starts with a database similarity search and identification of reliable gene fragments. The latter are used to derive statistical characteristics of protein-coding regions and ribosome-binding sites and to predict the complete set of genes in the analyzed genome. In a test on Bacillus subtilis and Escherichia coli genomes, the program correctly identified 93.3% (resp. 96.3%) of experimentally annotated genes longer than 100 codons described in the PIR-International database, and for these genes 96.3% (83.9%) of starts were predicted exactly. Furthermore, 98.9% (99.1%) of genes longer than 100 codons annotated in GenBank were found, and 92.9% (75.7%) of predicted starts coincided with the feature table description. Finally, for the complete gene complements of B.subtilis and E.coli, including genes shorter than 100 codons, gene prediction accuracy was 88.9 and 87.1%, respectively, with 94.2 and 76.7% starts coinciding with the existing annotation.
pubmed:commentsCorrections
http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-1480466, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-1641000, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-2231712, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-2464068, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-2507523, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-3118331, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-3525846, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-3937765, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-4598299, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-7497122, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-7920643, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-7984428, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-7984429, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8029015, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8165145, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8485583, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8688087, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8723345, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8743681, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8783942, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8799154, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-8901547, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9051728, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9149143, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9278503, 
http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9384377, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9399794, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9399796, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9403055, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9421513, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9461475, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9689213, http://linkedlifedata.com/resource/pubmed/commentcorrection/9611239-9697189
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Jun
pubmed:issn
0305-1048
pubmed:author
pubmed:issnType
Print
pubmed:day
15
pubmed:volume
26
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
2941-7
pubmed:dateRevised
2009-11-18
pubmed:meshHeading
pubmed:year
1998
pubmed:articleTitle
Combining diverse evidence for gene recognition in completely sequenced bacterial genomes.
pubmed:affiliation
Munich Information Center for Protein Sequences (MIPS) of the German National Center for Health and Environment (GSF), Am Klopferspitz 18a, 82152 Martinsried, Germany. frishman@mips.biochem.mpg.de
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't