Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
4
pubmed:dateCreated
2000-06-09
pubmed:abstractText
Ab initio gene identification in the genomic sequence of Drosophila melanogaster was obtained using (human gene predictor) and Fgenesh programs that have organism-specific parameters for human, Drosophila, plants, yeast, and nematode. We did not use information about cDNA/EST in most predictions to model a real situation for finding new genes because information about complete cDNA is often absent or based on very small partial fragments. We investigated the accuracy of gene prediction on different levels and designed several schemes to predict an unambiguous set of genes (annotation CGG1), a set of reliable exons (annotation CGG2), and the most complete set of exons (annotation CGG3). For 49 genes, protein products of which have clear homologs in protein databases, predictions were recomputed by Fgenesh+ program. The first annotation serves as the optimal computational description of new sequence to be presented in a database. Reliable exons from the second annotation serve as good candidates for selecting the PCR primers for experimental work for gene structure verification. Our results show that we can identify approximately 90% of coding nucleotides with 20% false positives. At the exon level we accurately predicted 65% of exons and 89% including overlapping exons with 49% false positives. Optimizing accuracy of prediction, we designed a gene identification scheme using Fgenesh, which provided sensitivity (Sn) = 98% and specificity (Sp) = 86% at the base level, Sn = 81% (97% including overlapping exons) and Sp = 58% at the exon level and Sn = 72% and Sp = 39% at the gene level (estimating sensitivity on std1 set and specificity on std3 set). In general, these results showed that computational gene prediction can be a reliable tool for annotating new genomic sequences, giving accurate information on 90% of coding sequences with 14% false positives.
However, exact gene prediction (especially at the gene level) needs additional improvement using gene prediction algorithms. The program was also tested for predicting genes of human Chromosome 22 (the last variant of Fgenesh can analyze the whole chromosome sequence). This analysis has demonstrated that 88% of manually annotated exons in Chromosome 22 were among the ab initio predicted exons. The suite of gene identification programs is available through the WWW server of Computational Genomics Group at http://genomic.sanger.ac.uk/gf.html.
pubmed:commentsCorrections
http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-10471707, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-10779478, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-1480466, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-1619647, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-2067018, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-2243775, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-7816600, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-7984429, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-8441672, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-8703057, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-8743681, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9149143, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9155028, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9254694, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9399790, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9584193, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9623988, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9666329, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9666331, http://linkedlifedata.com/resource/pubmed/commentcorrection/10779491-9847192
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Apr
pubmed:issn
1088-9051
pubmed:author
pubmed:issnType
Print
pubmed:volume
10
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
516-22
pubmed:dateRevised
2009-11-18
pubmed:meshHeading
pubmed:year
2000
pubmed:articleTitle
Ab initio gene finding in Drosophila genomic DNA.
pubmed:affiliation
The Sanger Centre, Hinxton, Cambridge CB10 1SA, UK.
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't
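
The abstract above reports sensitivity (Sn) and specificity (Sp) at the base level without defining them. The sketch below shows one way such figures could be computed, assuming the convention common in gene-finding evaluation: Sn is the fraction of annotated coding bases that were predicted, and Sp is the fraction of predicted coding bases that are annotated. That convention, the function name `base_level_accuracy`, and the toy coordinates are illustrative assumptions, not taken from the record or the paper.

```python
def base_level_accuracy(annotated, predicted):
    """Return (Sn, Sp) for two sets of coding base positions.

    Assumed convention (not stated in the record):
      Sn = TP / (TP + FN)  share of annotated coding bases that were predicted
      Sp = TP / (TP + FP)  share of predicted coding bases that are annotated
    """
    tp = len(annotated & predicted)   # bases coding in both sets
    fn = len(annotated - predicted)   # annotated bases the predictor missed
    fp = len(predicted - annotated)   # predicted bases with no annotation
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tp / (tp + fp) if (tp + fp) else 0.0
    return sn, sp


# Hypothetical toy example: one annotated exon and one overlapping prediction.
annotated = set(range(100, 200))   # 100 annotated coding bases
predicted = set(range(110, 220))   # 110 predicted coding bases
sn, sp = base_level_accuracy(annotated, predicted)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")
```

Exon-level and gene-level figures would need a matching rule for whole features (for example, exact boundary agreement), which this base-level sketch does not attempt.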