Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
2
pubmed:dateCreated
2007-4-25
pubmed:abstractText
As the demand for accurately aligning gene sequences to the genome of a related species grows with the sequencing of new genomes, spaced seeds emerge as a promising vehicle for increasing alignment sensitivity. We extend the existing {0, 1} match-mismatch models for sensitivity evaluation to take into account the compositional structure of coding sequences and ultimately produce seeds better suited to this particular application. Designing seeds for alignment programs, however, needs to balance sensitivity and specificity. We assess the effects of seed variations on both sensitivity and specificity in an extended model that incorporates transitions and differentiates among the three codon positions, and show that spaced seeds with transitions offer a better sensitivity-specificity tradeoff. Furthermore, we propose a theoretical formulation for rigorously assessing seed specificity, starting from Bernoulli and Markov models of the mRNA and genomic sequences. Within this framework, we perform the first comprehensive analysis of seeds to serve as a blueprint for selecting sensitive and specific seeds for practical applications. Our analyses show that specificity is relatively constant for seeds of a given weight, while sensitivity varies widely, with the highest values attained by seeds allowing a small (2-6) number of transitions. A strategy for designing seeds, therefore, is to first select the weight of the seed by identifying the desired sensitivity-specificity tradeoff, then choose the most sensitive seed(s) within that weight group. We illustrate our methods with the alignment of chicken coding sequences against the human genome assembly version HG17.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Mar
pubmed:issn
1066-5277
pubmed:author
pubmed:issnType
Print
pubmed:volume
14
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
113-30
pubmed:dateRevised
2009-11-3
pubmed:meshHeading
pubmed:year
2007
pubmed:articleTitle
Designing sensitive and specific spaced seeds for cross-species mRNA-to-genome alignment.
pubmed:affiliation
Department of Computer Science, George Washington University, Washington, DC 20052, USA.
pubmed:publicationType
Journal Article, Research Support, U.S. Gov't, Non-P.H.S., Research Support, Non-U.S. Gov't
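The abstract above refers to spaced seeds that allow transitions. As a rough illustration only, the sketch below shows how such a seed might be tested against a gapless alignment of two sequences. The seed alphabet ("1" for a required match, "@" for a match or a transition, "0" for a don't-care position) and the function names `position_ok` and `seed_hits` are assumptions made for this example; they are not taken from the cited article, and the sketch does not model codon positions, sensitivity, or specificity.

```python
# Illustrative sketch only: the seed alphabet and helper names below are
# assumptions for this example, not the notation or method of the cited article.

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def position_ok(symbol, x, y):
    """Check one aligned base pair (x, y) against one seed symbol."""
    if symbol == "1":  # bases must be identical
        return x == y
    if symbol == "@":  # identical bases or a transition are accepted
        return x == y or (x, y) in TRANSITIONS
    if symbol == "0":  # don't-care position: anything is accepted
        return True
    raise ValueError(f"unknown seed symbol: {symbol!r}")


def seed_hits(seed, a, b):
    """Return the start offsets at which the seed matches the gapless alignment of a and b."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    hits = []
    for start in range(len(a) - len(seed) + 1):
        if all(position_ok(s, a[start + i], b[start + i]) for i, s in enumerate(seed)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    mrna = "ACGTACGT"
    genome = "ATGTACAT"  # differs from mrna by transitions at offsets 1 and 6
    print(seed_hits("1@01", mrna, genome))  # seed tolerating a transition at its second position
    print(seed_hits("1101", mrna, genome))  # same span, but the second position must match
```

In this toy input the seed with a transition position can hit at offsets where the strict seed cannot, which is the kind of sensitivity gain the abstract attributes to seeds with transitions.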