Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
1
pubmed:dateCreated
2006-1-23
pubmed:abstractText
High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of the identification of novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore, the selection of the optimal data mining method is important for the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers with increasing stochastic noise in the form of false positives and false negatives. All three of the data mining methods at hand tolerated increasing levels of false positives even when the ratio of misclassified compounds to true active compounds was 5:1 in the training set. False negatives in the ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichments among the four data sets. This study demonstrates that data mining methods can add a true value to the screen even when the data is contaminated with a high level of stochastic noise.
pubmed:language
eng
pubmed:journal
pubmed:status
PubMed-not-MEDLINE
pubmed:issn
1549-9596
pubmed:author
pubmed:issnType
Print
pubmed:volume
46
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
193-200
pubmed:articleTitle
Enrichment of high-throughput screening data with increasing levels of noise using support vector machines, recursive partitioning, and laplacian-modified naive bayesian classifiers.
pubmed:affiliation
Lead Discovery Center, Novartis Institutes for Biomedical Research Inc., Cambridge, Massachusetts 02139, USA.
pubmed:publicationType
Journal Article