Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
2008-9-26
pubmed:abstractText
In genome-wide association studies using single nucleotide polymorphisms (SNPs), typically thousands of SNPs are genotyped, whereas the number of phenotypes for which there is genomic information may be smaller. A two-step SNP (feature) selection method was developed, which consisted of filtering (using information gain) and wrapping (using naïve Bayesian classification). This was based on discretization of the continuous phenotypic values. The method was applied to chick early mortality rates (0-14 days of age) on progeny from 201 sires in a commercial broiler line, with the goal of identifying SNPs (over 5000) related to progeny mortality. Sires were clustered into two groups, low and high, according to two arbitrarily chosen mortality rate thresholds. By varying these thresholds, 11 different "case-control" samples were formed, and the SNP selection procedure was applied to each sample. To compare the 11 sets of chosen SNPs, predicted residual sum of squares (PRESS) from a linear model was used. Naïve Bayesian classification accuracy was improved over the case without feature selection (from 50% to 90%). Seventeen SNPs in the best case-control group (with smallest PRESS) accounted for 31% of the variance among sire family mortality rates.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:issn
1424-6074
pubmed:author
pubmed:issnType
Print
pubmed:volume
132
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
373-6
pubmed:meshHeading
pubmed:year
2008
pubmed:articleTitle
Machine learning classification procedure for selecting SNPs in genomic selection: application to early mortality in broilers.
pubmed:affiliation
Department of Animal Sciences, University of Wisconsin, Madison, WI 53706, USA. nlong@wisc.edu
pubmed:publicationType
Journal Article
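
Illustrative note on the filtering step named in pubmed:abstractText. The abstract says SNPs were first filtered by information gain against a discretized (low/high) phenotype. The sketch below shows one plausible way to compute that score for discrete genotype codes. It is not the authors' code: the function names (entropy, information_gain, rank_snps), the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration, and the wrapper step (naïve Bayesian classification) is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditional on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_snps(genotypes, labels, top_k):
    """Return the names of the top_k SNPs by information gain (filter step)."""
    gains = {name: information_gain(col, labels) for name, col in genotypes.items()}
    return sorted(gains, key=gains.get, reverse=True)[:top_k]


# Hypothetical toy data: four sires, genotypes coded 0/1/2 (assumed coding).
labels = ["low", "low", "high", "high"]
genotypes = {
    "snp_a": [0, 0, 2, 2],  # separates the two groups perfectly
    "snp_b": [0, 1, 0, 1],  # carries no information about the group
}
selected = rank_snps(genotypes, labels, top_k=1)
```

In this toy case snp_a is ranked ahead of snp_b, because its genotypes split the sires exactly into the low and high groups.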