Statements in which the resource exists as a subject.
Predicate | Object
rdf:type
lifeskim:mentions
pubmed:issue
9
pubmed:dateCreated
2006-4-19
pubmed:abstractText
MOTIVATION: Arabidopsis thaliana has the best understood plant genome, yet approximately one-third of its genes still have no functional annotation at all from either MIPS or TAIR. We have applied our Data Mining Prediction (DMP) method to the problem of predicting the functional classes of these protein sequences. This method uses a hybrid machine-learning/data-mining approach to identify patterns in the bioinformatic data about sequences that are predictive of function. We use data about sequence, predicted secondary structure, predicted structural domain, InterPro patterns, sequence similarity profile and expression data. RESULTS: We predicted the functional class of a high percentage of the Arabidopsis genes with currently unknown function. These predictions are interpretable and have good test accuracies. We describe in detail seven of the rules produced.
pubmed:commentsCorrections
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
May
pubmed:issn
1367-4803
pubmed:author
pubmed:issnType
Print
pubmed:day
1
pubmed:volume
22
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
1130-6
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
2006
pubmed:articleTitle
Functional bioinformatics for Arabidopsis thaliana.
pubmed:affiliation
Department of Computer Science, University of Wales Aberystwyth SY23 3DB, UK. afc@aber.ac.uk
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't