Statements in which the resource exists as a subject.
Predicate, Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
2003-7-11
pubmed:abstractText
MOTIVATION: The biological literature is a major repository of knowledge. Many biological databases draw much of their content from a careful curation of this literature. However, as the volume of literature increases, the burden of curation increases. Text mining may provide useful tools to assist in the curation process. To date, the lack of standards has made it impossible to determine whether text mining techniques are sufficiently mature to be useful. RESULTS: We report on a Challenge Evaluation task that we created for the Knowledge Discovery and Data Mining (KDD) Challenge Cup. We provided a training corpus of 862 articles consisting of journal articles curated in FlyBase, along with the associated lists of genes and gene products, as well as the relevant data fields from FlyBase. For the test, we provided a corpus of 213 new ('blind') articles; the 18 participating groups provided systems that flagged articles for curation, based on whether the article contained experimental evidence for gene expression products. We report on the evaluation results and describe the techniques used by the top performing groups.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
1367-4803
pubmed:author
pubmed:issnType
Print
pubmed:volume
19 Suppl 1
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
i331-9
pubmed:dateRevised
2007-11-15
pubmed:meshHeading
pubmed:year
2003
pubmed:articleTitle
Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup.
pubmed:affiliation
The MITRE Corporation, 202 Burlington Road, Bedford, MA 01730, USA. asy@mitre.org
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't, Evaluation Studies, Validation Studies