Source:http://linkedlifedata.com/resource/pubmed/id/18998887
Predicate | Object |
---|---|
rdf:type | |
lifeskim:mentions | |
pubmed:dateCreated | 2008-11-12 |
pubmed:abstractText | Many policies and projects now encourage investigators to share their raw research data with other scientists. Unfortunately, it is difficult to measure the effectiveness of these initiatives because data can be shared in such a variety of mechanisms and locations. We propose a novel approach to finding shared datasets: using NLP techniques to identify declarations of dataset sharing within the full text of primary research articles. Using regular expression patterns and machine learning algorithms on open access biomedical literature, our system was able to identify 61% of articles with shared datasets with 80% precision. A simpler version of our classifier achieved higher recall (86%), though lower precision (49%). We believe our results demonstrate the feasibility of this approach and hope to inspire further study of dataset retrieval techniques and policy evaluation. |
pubmed:grant | |
pubmed:commentsCorrections | http://linkedlifedata.com/resource/pubmed/commentcorrection/18998887-16436574; http://linkedlifedata.com/resource/pubmed/commentcorrection/18998887-16822095; http://linkedlifedata.com/resource/pubmed/commentcorrection/18998887-17077452; http://linkedlifedata.com/resource/pubmed/commentcorrection/18998887-17238312; http://linkedlifedata.com/resource/pubmed/commentcorrection/18998887-17375194 |
pubmed:language | eng |
pubmed:journal | |
pubmed:citationSubset | IM |
pubmed:status | MEDLINE |
pubmed:issn | 1942-597X |
pubmed:author | |
pubmed:issnType | Electronic |
pubmed:owner | NLM |
pubmed:authorsComplete | Y |
pubmed:pagination | 596-600 |
pubmed:meshHeading | pubmed-meshheading:18998887-Algorithms; pubmed-meshheading:18998887-Artificial Intelligence; pubmed-meshheading:18998887-Cooperative Behavior; pubmed-meshheading:18998887-Information Dissemination; pubmed-meshheading:18998887-Information Storage and Retrieval; pubmed-meshheading:18998887-Natural Language Processing; pubmed-meshheading:18998887-Pattern Recognition, Automated; pubmed-meshheading:18998887-Periodicals as Topic; pubmed-meshheading:18998887-Subject Headings |
pubmed:year | 2008 |
pubmed:articleTitle | Identifying data sharing in biomedical literature. |
pubmed:affiliation | University of Pittsburgh, Pittsburgh, PA, USA. |
pubmed:publicationType | Journal Article; Research Support, N.I.H., Extramural |
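
The abstract describes matching regular expression patterns against article full text and reporting precision and recall. The following is a minimal sketch of that idea, not the authors' system: the patterns, the repository names they look for, the function names, and the four toy sentences are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for statements that declare dataset sharing.
# These are illustrative only; the record does not list the patterns
# actually used in the paper.
SHARING_PATTERNS = [
    re.compile(
        r"\b(deposited|submitted|available)\b.{0,80}\b(GEO|ArrayExpress|GenBank)\b",
        re.IGNORECASE,
    ),
    re.compile(r"\baccession (number|no\.?|code)s?\b", re.IGNORECASE),
]


def mentions_data_sharing(text: str) -> bool:
    """Return True if any hypothetical pattern matches the text."""
    return any(pattern.search(text) for pattern in SHARING_PATTERNS)


def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, hand-labelled sentences (made up for this sketch).
EXAMPLES = [
    ("The microarray data were deposited in GEO under accession number GSE0000.", True),
    ("Raw data are available from the authors upon request.", True),
    ("We downloaded sequences using the accession numbers listed in Table 1.", False),
    ("Cells were cultured in standard medium.", False),
]

predicted = [mentions_data_sharing(text) for text, _ in EXAMPLES]
actual = [label for _, label in EXAMPLES]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The toy set is built so that the patterns miss one sharing statement and wrongly flag one reuse statement, which shows the trade-off the abstract reports: broader patterns tend to raise recall at the cost of precision.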