11331239

Source: http://linkedlifedata.com/resource/pubmed/id/11331239

Statements in which the resource exists as a subject.
Predicate	Object
rdf:type	pubmed:Citation
lifeskim:mentions	umls-concept:C0002518, umls-concept:C0205245, umls-concept:C0456387, umls-concept:C1514861, umls-concept:C1882932
pubmed:issue	5
pubmed:dateCreated	2001-5-1
pubmed:abstractText	MOTIVATION: Data Mining Prediction (DMP) is a novel approach to predicting protein functional class from sequence. DMP works even in the absence of a homologous protein of known function. We investigate the utility of different ways of representing protein sequence in DMP (residue frequencies, phylogeny, predicted structure) using the Escherichia coli genome as a model. RESULTS: Using the different representations, DMP learnt prediction rules that were more accurate than default at every level of function using every type of representation. The most effective way to represent sequence was using phylogeny (75% accuracy and 13% coverage of unassigned ORFs at the most general level of function; 69% accuracy and 7% coverage at the most detailed). We tested different methods for combining predictions from the different types of representation. These improved both the accuracy and coverage of predictions, e.g. 40% of all unassigned ORFs could be predicted at an estimated accuracy of 60% and 5% of unassigned ORFs could be predicted at an estimated accuracy of 86%.
pubmed:language	eng
pubmed:journal	http://linkedlifedata.com/resource/pubmed/journal/9808944
pubmed:citationSubset	IM
pubmed:chemical	http://linkedlifedata.com/resource/pubmed/chemical/Bacterial Proteins, http://linkedlifedata.com/resource/pubmed/chemical/Proteins
pubmed:status	MEDLINE
pubmed:month	May
pubmed:issn	1367-4803
pubmed:author	pubmed-author:ClaraMM, pubmed-author:DehaspeLL, pubmed-author:KarwathAA, pubmed-author:KingR DRD
pubmed:issnType	Print
pubmed:volume	17
pubmed:owner	NLM
pubmed:authorsComplete	Y
pubmed:pagination	445-54
pubmed:meshHeading	pubmed-meshheading:11331239-Bacterial Proteins, pubmed-meshheading:11331239-Computational Biology, pubmed-meshheading:11331239-Escherichia coli, pubmed-meshheading:11331239-Open Reading Frames, pubmed-meshheading:11331239-Proteins, pubmed-meshheading:11331239-Sequence Analysis, Protein, pubmed-meshheading:11331239-Software Design
pubmed:year	2001
pubmed:articleTitle	The utility of different representations of protein sequence for predicting functional class.
pubmed:affiliation	Department of Computer Science, University of Wales, Aberystwyth, Penglais, Aberystwyth, Ceredigion SY23 3DB, Wales, UK. rdk@aber.ac.uk
pubmed:publicationType	Journal Article