Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
5
pubmed:dateCreated
2000-01-11
pubmed:abstractText
The recognition of remote protein homologies is a major aspect of the structural and functional annotation of newly determined genomes. Here we benchmark the coverage and error rate of genome annotation using the widely used homology-searching program PSI-BLAST (position-specific iterated basic local alignment search tool). This study evaluates the one-to-many success rate for recognition, as often there are several homologues in the database and only one needs to be identified for annotating the sequence. In contrast, previous benchmarks considered one-to-one recognition in which a single query was required to find a particular target. The benchmark constructs a model genome from the full sequences of the structural classification of protein (SCOP) database and searches against a target library of remote homologous domains (<20 % identity). The structural benchmark provides a reliable list of correct and false homology assignments. PSI-BLAST successfully annotated 40 % of the domains in the model genome that had at least one homologue in the target library. This coverage is more than three times that if one-to-one recognition is evaluated (11 % coverage of domains). Although a structural benchmark was used, the results equally apply to just sequence homology searches. Accordingly, structural and sequence assignments were made to the sequences of Mycoplasma genitalium and Mycobacterium tuberculosis (see http://www.bmm.icnet.uk). The extent of missed assignments and of new superfamilies can be estimated for these genomes for both structural and functional annotations.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Nov
pubmed:issn
0022-2836
pubmed:author
pubmed:copyrightInfo
Copyright 1999 Academic Press.
pubmed:issnType
Print
pubmed:day
12
pubmed:volume
293
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
1257-71
pubmed:dateRevised
2000-12-18
pubmed:meshHeading
pubmed-meshheading:10547299-Algorithms, pubmed-meshheading:10547299-Bacterial Proteins, pubmed-meshheading:10547299-Benchmarking, pubmed-meshheading:10547299-Computational Biology, pubmed-meshheading:10547299-Conserved Sequence, pubmed-meshheading:10547299-Databases, Factual, pubmed-meshheading:10547299-False Positive Reactions, pubmed-meshheading:10547299-Genome, Bacterial, pubmed-meshheading:10547299-Internet, pubmed-meshheading:10547299-Multigene Family, pubmed-meshheading:10547299-Mycobacterium tuberculosis, pubmed-meshheading:10547299-Mycoplasma, pubmed-meshheading:10547299-Open Reading Frames, pubmed-meshheading:10547299-Sensitivity and Specificity, pubmed-meshheading:10547299-Sequence Alignment, pubmed-meshheading:10547299-Sequence Homology, Amino Acid, pubmed-meshheading:10547299-Software, pubmed-meshheading:10547299-Structure-Activity Relationship
pubmed:year
1999
pubmed:articleTitle
Benchmarking PSI-BLAST in genome annotation.
pubmed:affiliation
Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London, WC2A 3PX, England.
pubmed:publicationType
Journal Article
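
The abstract in this record contrasts one-to-many recognition (a query counts as annotated if any one of its homologues is found) with one-to-one recognition (every query-target pair must be found individually). The following is a minimal sketch of how the two coverage figures could be computed, not the authors' implementation. The function name `coverage`, the dictionary layout, and the toy identifiers are all hypothetical, and the exact definitions used in the paper may differ.

```python
def coverage(hits, homologues):
    """Return (one_to_many, one_to_one) coverage as fractions.

    hits:       dict mapping each query to the set of targets reported for it
    homologues: dict mapping each query to the set of its true homologues
    Both layouts are assumptions made for this illustration.
    """
    # Only queries with at least one true homologue in the library are scored.
    queries = [q for q, targets in homologues.items() if targets]
    # Every (query, true homologue) pair is a separate one-to-one test.
    pairs = [(q, t) for q in queries for t in homologues[q]]

    # One-to-many: a query succeeds if any true homologue is among its hits.
    annotated = sum(1 for q in queries if hits.get(q, set()) & homologues[q])
    # One-to-one: each pair succeeds only if that specific target was hit.
    found = sum(1 for q, t in pairs if t in hits.get(q, set()))

    one_to_many = annotated / len(queries) if queries else 0.0
    one_to_one = found / len(pairs) if pairs else 0.0
    return one_to_many, one_to_one


# Toy data, invented for illustration only (not results from the paper).
homologues = {"d1": {"t1", "t2", "t3"}, "d2": {"t4"}, "d3": set()}
hits = {"d1": {"t2"}, "d2": {"t9"}}
one_to_many, one_to_one = coverage(hits, homologues)
```

With this toy input, one of the two scorable queries is annotated, but only one of the four query-target pairs is recovered. One-to-many coverage therefore comes out higher than one-to-one coverage, which is the direction of the gap the abstract reports (40 % versus 11 %).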