Statements in which the resource exists as a subject.
Predicate	Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
2003-2-26
pubmed:abstractText
A growing body of work is devoted to the extraction of protein or gene interaction information from the scientific literature. Yet, the basis for most extraction algorithms, i.e. the specific and sensitive recognition of protein and gene names and their numerous synonyms, has not been adequately addressed. Here we describe the construction of a comprehensive general purpose name dictionary and an accompanying automatic curation procedure based on a simple token model of protein names. We designed an efficient search algorithm to analyze all abstracts in MEDLINE in a reasonable amount of time on standard computers. The parameters of our method are optimized using machine learning techniques. Used in conjunction, these ingredients lead to good search performance. A supplementary web page is available at http://cartan.gmd.de/ProMiner/.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
1793-5091
pubmed:author
pubmed:issnType
Print
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
403-14
pubmed:dateRevised
2007-11-15
pubmed:meshHeading
pubmed:year
2003
pubmed:articleTitle
Playing biology's name game: identifying protein names in scientific text.
pubmed:affiliation
Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, D-53754 Sankt Augustin, Germany. Daniel.Hanisch@scai.fhg.de
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't, Validation Studies
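
The abstract in this record mentions a name dictionary with synonyms and "a simple token model of protein names" but gives no implementation detail. The following is only a minimal sketch of what token-based dictionary lookup can look like in general; it is not the authors' ProMiner algorithm. The function names (tokenize, build_index, find_names), the tokenization rule, and the example synonym list are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into runs of letters and runs of digits.

    Assumed rule for illustration: 'IL-2' and 'IL2' both become ['il', '2'].
    """
    return re.findall(r"[a-z]+|[0-9]+", text.lower())

def build_index(synonyms: Dict[str, List[str]]) -> Dict[Tuple[str, ...], str]:
    """Map each synonym's token sequence to its canonical name."""
    index: Dict[Tuple[str, ...], str] = {}
    for canonical, names in synonyms.items():
        for name in names:
            key = tuple(tokenize(name))
            if key:  # skip names that contain no letters or digits
                index[key] = canonical
    return index

def find_names(text: str, index: Dict[Tuple[str, ...], str]) -> List[str]:
    """Scan left to right, preferring the longest token sequence that matches."""
    tokens = tokenize(text)
    max_len = max((len(key) for key in index), default=0)
    hits: List[str] = []
    i = 0
    while i < len(tokens):
        matched = 0
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                hits.append(index[key])
                matched = n
                break
        # Jump past a match, or advance one token if nothing matched here.
        i += matched if matched else 1
    return hits

# Hypothetical dictionary entry, not taken from the record above.
example_index = build_index({"IL2": ["interleukin 2", "IL-2", "IL2"]})
example_hits = find_names("Expression of IL-2 and interleukin-2 receptor", example_index)
```

Comparing token sequences instead of raw strings makes spelling variants such as "IL-2", "IL2", and "interleukin-2" resolve to the same entry without listing every variant. A realistic system would also need to handle ambiguous and overly general synonyms, which the abstract attributes to an automatic curation procedure not sketched here.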