Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
5
pubmed:dateCreated
2008-10-27
pubmed:abstractText
One frontier of modern statistical research is the problems arising from data sets with extremely large k (>1000) populations, e.g. microarray and neuroimaging data. For many such problems the focus shifts from testing for significance to selecting, filtering, or screening. Classical Ranking and Selection Methodology (RSM) studied the probability of correct selection (PCS). PCS is the probability that the "best" (t = 1) of k populations is truly selected, according to some specified criteria of best. This paper extends and adapts two selection goals from the RSM literature that are suitable for large k problems (d-best and G-best selection). It is then shown how estimation of PCS for selecting multiple (t > 1) populations with d-best and G-best selection can be implemented to provide a useful measure of the quality of a given selection. A simulation study and the application of the proposed method to a benchmark microarray data set show it is an effective and versatile tool for assessing the probability that a particular gene selection or gene filtering step truly obtains the best genes. Moreover, the proposed method is fully general and may be applied to any such extremely large k problem.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:month
Oct
pubmed:issn
1521-4036
pubmed:author
pubmed:issnType
Electronic
pubmed:volume
50
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
870-83
pubmed:meshHeading
pubmed:year
2008
pubmed:articleTitle
On the probability of correct selection for large k populations, with application to microarray data.
pubmed:affiliation
Department of Statistics, University of California, Riverside, CA 92521, USA. xinping.cui@ucr.edu
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't