19291098

Source:http://linkedlifedata.com/resource/pubmed/id/19291098

Statements in which the resource exists as a subject.
Predicate	Object
rdf:type	pubmed:Citation
lifeskim:mentions	umls-concept:C0025663, umls-concept:C0085732, umls-concept:C0206031, umls-concept:C1511790, umls-concept:C1704675
pubmed:issue	Pt 3
pubmed:dateCreated	2009-5-21
pubmed:abstractText	Most common human diseases are likely to have complex etiologies. Methods of analysis that allow for the phenomenon of epistasis are of growing interest in the genetic dissection of complex diseases. By allowing for epistatic interactions between potential disease loci, we may succeed in identifying genetic variants that might otherwise have remained undetected. Here we aimed to analyze the ability of logistic regression (LR) and two tree-based supervised learning methods, classification and regression trees (CART) and random forest (RF), to detect epistasis. Multifactor-dimensionality reduction (MDR) was also used for comparison. Our approach involves first the simulation of datasets of autosomal biallelic unphased and unlinked single nucleotide polymorphisms (SNPs), each containing a two-loci interaction (causal SNPs) and 98 'noise' SNPs. We modelled interactions under different scenarios of sample size, missing data, minor allele frequencies (MAF) and several penetrance models: three involving both (indistinguishable) marginal effects and interaction, and two simulating pure interaction effects. In total, we have simulated 99 different scenarios. Although CART, RF, and LR yield similar results in terms of detection of true association, CART and RF perform better than LR with respect to classification error. MAF, penetrance model, and sample size are greater determining factors than percentage of missing data in the ability of the different techniques to detect true association. In pure interaction models, only RF detects association. In conclusion, tree-based methods and LR are important statistical tools for the detection of unknown interactions among true risk-associated SNPs with marginal effects and in the presence of a significant number of noise SNPs. In pure interaction models, RF performs reasonably well in the presence of large sample sizes and low percentages of missing data. However, when the study design is suboptimal (unfavourable to detect interaction in terms of e.g. sample size and MAF) there is a high chance of detecting false, spurious associations.
pubmed:language	eng
pubmed:journal	http://linkedlifedata.com/resource/pubmed/journal/0416661
pubmed:citationSubset	IM
pubmed:status	MEDLINE
pubmed:month	May
pubmed:issn	1469-1809
pubmed:author	pubmed-author:CaoRicardoR, pubmed-author:García-MagariñosManuelM, pubmed-author:López-de-UllibarriIñakiI, pubmed-author:SalasAntonioA
pubmed:issnType	Electronic
pubmed:volume	73
pubmed:owner	NLM
pubmed:authorsComplete	Y
pubmed:pagination	360-9
pubmed:meshHeading	pubmed-meshheading:19291098-Computational Biology, pubmed-meshheading:19291098-Computer Simulation, pubmed-meshheading:19291098-Epistasis, Genetic, pubmed-meshheading:19291098-Gene Frequency, pubmed-meshheading:19291098-Humans, pubmed-meshheading:19291098-Logistic Models, pubmed-meshheading:19291098-Models, Genetic, pubmed-meshheading:19291098-Models, Statistical, pubmed-meshheading:19291098-Polymorphism, Single Nucleotide
pubmed:year	2009
pubmed:articleTitle	Evaluating the ability of tree-based methods and logistic regression for the detection of SNP-SNP interaction.
pubmed:affiliation	Unidade de Xenética, Instituto de Medicina Legal and Departamento de Anatomía Patológica y Ciencias Forenses, Facultade de Medicina, Universidade de Santiago de Compostela, Galicia, Spain.
pubmed:publicationType	Journal Article, Research Support, Non-U.S. Gov't, Evaluation Studies
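
The simulation design described in the abstract (two causal SNPs with an interaction, plus 98 unlinked 'noise' SNPs) can be illustrated with a minimal sketch. This is not the authors' code: the penetrance table, the function names, the assumption of Hardy-Weinberg genotype frequencies, and the simple population sampling (rather than any case-control design) are all hypothetical choices made here for illustration. The table is a textbook-style pure interaction model that has no marginal effects when MAF = 0.5.

```python
import random

# Hypothetical penetrance table (NOT taken from the paper): rows index the
# genotype at causal SNP 1, columns the genotype at causal SNP 2, coded as
# the number of minor alleles (0, 1, 2).
PURE_INTERACTION = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def genotype_probs(maf):
    """Genotype frequencies for 0/1/2 minor alleles, assuming Hardy-Weinberg."""
    q = maf
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def simulate_dataset(n_samples, maf, penetrance, n_noise=98, seed=0):
    """Draw unlinked biallelic SNP genotypes and a binary disease status.

    Columns 0 and 1 are the causal SNPs; the remaining n_noise columns are
    noise SNPs that do not influence disease status.
    """
    rng = random.Random(seed)
    weights = genotype_probs(maf)
    genotypes, status = [], []
    for _ in range(n_samples):
        row = rng.choices([0, 1, 2], weights=weights, k=2 + n_noise)
        risk = penetrance[row[0]][row[1]]
        genotypes.append(row)
        status.append(1 if rng.random() < risk else 0)
    return genotypes, status


def marginal_penetrance(maf, penetrance):
    """Disease risk per genotype at causal SNP 1, averaged over causal SNP 2."""
    probs = genotype_probs(maf)
    return [
        sum(probs[b] * penetrance[a][b] for b in range(3))
        for a in range(3)
    ]
```

Under these assumptions, `marginal_penetrance(0.5, PURE_INTERACTION)` gives the same risk for every genotype at the first causal SNP, which is what makes single-locus tests uninformative in a pure interaction model. The genotype matrix and status vector returned by `simulate_dataset` could then be passed to any classifier of the reader's choice.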