Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
9
pubmed:dateCreated
2005-4-27
pubmed:abstractText
Phylogenetic analysis of protein sequences is widely used in protein function classification and delineation of subfamilies within larger families. In addition, the recent increase in the number of protein sequence entries with controlled vocabulary terms describing function (e.g. the Gene Ontology) suggests that it may be possible to overlay these terms onto phylogenetic trees to automatically locate functional divergence events in protein family evolution. Phylogenetic analysis of large datasets requires fast algorithms, yet even 'fast', approximate distance matrix-based phylogenetic algorithms are slow on large datasets because they involve calculating maximum likelihood estimates of pairwise evolutionary distances. There have been many attempts to classify protein sequences at the family and subfamily level without reconstructing phylogenetic trees, instead using hierarchical clustering with simpler distance measures, which also produces trees or dendrograms. How can these trees be compared in their ability to accurately classify protein sequences?
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
May
pubmed:issn
1367-4803
pubmed:author
pubmed:issnType
Print
pubmed:day
1
pubmed:volume
21
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
1876-90
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
2005
pubmed:articleTitle
On the quality of tree-based protein classification.
pubmed:affiliation
Computational Biology Department, Applied Biosystems, Foster City, CA 94404, USA. betty.lazareva@fc.celera.com
pubmed:publicationType
Journal Article, Comparative Study, Evaluation Studies