Statements in which the resource exists.
Subject | Predicate | Object | Context
pubmed-article:19229086 | rdf:type | pubmed:Citation | lld:pubmed
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C0025663 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C1705104 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C1527021 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C0008902 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C0205554 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C1705313 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C1515273 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C1548559 | lld:lifeskim
pubmed-article:19229086 | lifeskim:mentions | umls-concept:C0443324 | lld:lifeskim
pubmed-article:19229086 | pubmed:issue | 4 | lld:pubmed
pubmed-article:19229086 | pubmed:dateCreated | 2009-2-20 | lld:pubmed
pubmed-article:19229086 | pubmed:abstractText | In vector space model (VSM), text representation is the task of transforming the content of a textual document into a vector in the term space so that the document could be recognized and classified by a computer or a classifier. Different terms (i.e. words, phrases, or any other indexing units used to identify the contents of a text) have different importance in a text. The term weighting methods assign appropriate weights to the terms to improve the performance of text categorization. In this study, we investigate several widely-used unsupervised (traditional) and supervised term weighting methods on benchmark data collections in combination with SVM and kNN algorithms. In consideration of the distribution of relevant documents in the collection, we propose a new simple supervised term weighting method, i.e. tf.rf, to improve the terms' discriminating power for text categorization task. From the controlled experimental results, these supervised term weighting methods have mixed performance. Specifically, our proposed supervised term weighting method, tf.rf, has a consistently better performance than other term weighting methods while other supervised term weighting methods based on information theory or statistical metric perform the worst in all experiments. On the other hand, the popularly used tf.idf method has not shown a uniformly good performance in terms of different data sets. | lld:pubmed
pubmed-article:19229086 | pubmed:language | eng | lld:pubmed
pubmed-article:19229086 | pubmed:journal | http://linkedlifedata.com/r... | lld:pubmed
pubmed-article:19229086 | pubmed:status | PubMed-not-MEDLINE | lld:pubmed
pubmed-article:19229086 | pubmed:month | Apr | lld:pubmed
pubmed-article:19229086 | pubmed:issn | 0162-8828 | lld:pubmed
pubmed-article:19229086 | pubmed:author | pubmed-author:RakS LSL | lld:pubmed
pubmed-article:19229086 | pubmed:author | pubmed-author:GunnR NRN | lld:pubmed
pubmed-article:19229086 | pubmed:author | pubmed-author:LuYueY | lld:pubmed
pubmed-article:19229086 | pubmed:author | pubmed-author:TanChew LimCL | lld:pubmed
pubmed-article:19229086 | pubmed:issnType | Print | lld:pubmed
pubmed-article:19229086 | pubmed:volume | 31 | lld:pubmed
pubmed-article:19229086 | pubmed:owner | NLM | lld:pubmed
pubmed-article:19229086 | pubmed:authorsComplete | Y | lld:pubmed
pubmed-article:19229086 | pubmed:pagination | 721-35 | lld:pubmed
pubmed-article:19229086 | pubmed:year | 2009 | lld:pubmed
pubmed-article:19229086 | pubmed:articleTitle | Supervised and traditional term weighting methods for automatic text categorization. | lld:pubmed
pubmed-article:19229086 | pubmed:affiliation | Department of Computer Science and Technology, East China Normal University, Shanghai, China. mlan@cs.ecnu.edu.cn | lld:pubmed
pubmed-article:19229086 | pubmed:publicationType | Journal Article | lld:pubmed
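
The abstractText statement above names a supervised term weighting method, tf.rf, but does not give its formula. The sketch below assumes the commonly cited definition of relevance frequency, rf = log2(2 + a / max(1, c)), where a is the number of positive-category documents containing the term and c is the number of negative-category documents containing it. That definition, and the function and parameter names, are assumptions for illustration and are not taken from this record.

```python
import math


def relevance_frequency(pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Assumed rf definition: log2(2 + a / max(1, c)).

    a = positive-category documents containing the term
    c = negative-category documents containing the term
    max(1, c) avoids division by zero when no negative document has the term.
    """
    return math.log2(2 + pos_docs_with_term / max(1, neg_docs_with_term))


def tf_rf(term_frequency: float, pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Weight of a term in one document: raw term frequency times rf."""
    return term_frequency * relevance_frequency(pos_docs_with_term, neg_docs_with_term)


if __name__ == "__main__":
    # Hypothetical counts: the term appears in 6 positive and 1 negative documents,
    # and occurs twice in the document being weighted.
    print(tf_rf(2, 6, 1))
```

Under this assumed definition, a term that occurs mostly in positive-category documents gets a larger weight, while a term spread evenly across both categories stays close to its raw term frequency.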