Source: http://linkedlifedata.com/resource/pubmed/id/20001252
Predicate | Object |
---|---|
rdf:type | |
lifeskim:mentions | |
pubmed:issue | 12 |
pubmed:dateCreated | 2010-1-5 |
pubmed:abstractText | Large-scale comparison of the similarities between two biological sequences is a major issue in computational biology; a fast method, the D(2) statistic, relies on the comparison of the k-tuple content for both sequences. Although it has been known for some years that the D(2) statistic is not suitable for this task, as it tends to be dominated by single-sequence noise, to date no suitable adjustments have been proposed. In this article, we suggest two new variants of the D(2) word count statistic, which we call D(2)(S) and D(2)(*). For D(2)(S), which is a self-standardized statistic, we show that the statistic is asymptotically normally distributed, when sequence lengths tend to infinity, and not dominated by the noise in the individual sequences. The second statistic, D(2)(*), outperforms D(2)(S) in terms of power for detecting the relatedness between the two sequences in our examples; but although it is straightforward to simulate from the asymptotic distribution of D(2)(*), we cannot provide a closed form for power calculations. |
pubmed:grant | http://linkedlifedata.com/resource/pubmed/grant/, http://linkedlifedata.com/resource/pubmed/grant/P50 HG 002790, http://linkedlifedata.com/resource/pubmed/grant/P50 HG002790-01A1, http://linkedlifedata.com/resource/pubmed/grant/R21 AG032743-01A1, http://linkedlifedata.com/resource/pubmed/grant/R21AG032743 |
pubmed:commentsCorrections | |
pubmed:language | eng |
pubmed:journal | |
pubmed:citationSubset | IM |
pubmed:status | MEDLINE |
pubmed:month | Dec |
pubmed:issn | 1557-8666 |
pubmed:author | |
pubmed:issnType | Electronic |
pubmed:volume | 16 |
pubmed:owner | NLM |
pubmed:authorsComplete | Y |
pubmed:pagination | 1615-34 |
pubmed:dateRevised | 2011-9-26 |
pubmed:meshHeading | |
pubmed:year | 2009 |
pubmed:articleTitle | Alignment-free sequence comparison (I): statistics and power. |
pubmed:affiliation | Department of Statistics, University of Oxford, Oxford OX1 3TG, UK. |
pubmed:publicationType | Journal Article; Research Support, Non-U.S. Gov't; Research Support, N.I.H., Extramural |
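
The abstract in this record describes the D(2) statistic as a comparison of the k-tuple content of two sequences. As a rough illustration only, the unadjusted D(2) is commonly computed as the sum, over all k-tuples, of the product of that k-tuple's counts in the two sequences. The sketch below assumes that definition; the function names `kmer_counts` and `d2` are hypothetical and do not come from the record or the article.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (k-mer) in seq."""
    # A sequence shorter than k yields an empty range, hence no counts.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Unadjusted D(2): sum over k-tuples w of count_a(w) * count_b(w).

    Assumed definition for illustration; not taken from this record.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-tuples absent from seq_b, so only shared
    # k-tuples contribute to the sum.
    return sum(n * counts_b[w] for w, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "TTACGTAA", 3))
```

The D(2)(S) and D(2)(*) variants named in the abstract are not sketched here, because this record does not state their formulas.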