Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
2
pubmed:dateCreated
2003-7-21
pubmed:abstractText
We introduce a novel, linguistic-like method of genome analysis. We propose a natural approach to characterizing genomic sequences based on occurrences of fixed-length words from a predefined, sufficiently large set of words (strings over the alphabet [A, C, G, T]). A measure based on this approach is called the compositional spectrum and is actually a histogram of imperfect word occurrences. Our results assert that the compositional spectrum is an overall characteristic of a long sequence, i.e., a complete genome or an uninterrupted part of a chromosome. This attribute is manifested in the similarity of spectra obtained on different stretches of the same genome, and simultaneously in a broad range of dissimilarities between spectral representations of different genomes. High flexibility characterizes this approach due to imperfect matching, and as a result, sets of relatively long words can be considered. The proposed approach may have various applications in intra- and intergenomic sequence comparisons.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
0001-5342
pubmed:author
pubmed:issnType
Print
pubmed:volume
51
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
73-89
pubmed:dateRevised
2007-11-15
pubmed:meshHeading
pubmed:year
2003
pubmed:articleTitle
A large-scale comparison of genomic sequences: one promising approach.
pubmed:affiliation
Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel. valery@esti.haifa.ac.il
pubmed:publicationType
Journal Article, Comparative Study, Research Support, Non-U.S. Gov't
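
The abstract above describes the compositional spectrum as a histogram of imperfect occurrences of fixed-length words. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that "imperfect" means a Hamming distance of at most max_mismatches between a word and a sequence window; the record does not say how the article defines the match criterion or chooses the word set, and the function names are hypothetical.

```python
from typing import List, Sequence


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def compositional_spectrum(
    sequence: str, words: Sequence[str], max_mismatches: int = 1
) -> List[int]:
    """Return, for each word, the number of windows of the sequence
    that match it with at most max_mismatches substitutions.

    Assumption: imperfect matching is modelled as a Hamming-distance
    threshold; the source record does not specify the criterion.
    """
    if not words:
        return []
    length = len(words[0])
    if any(len(word) != length for word in words):
        raise ValueError("all words must have the same length")

    counts = [0] * len(words)
    # Slide a window of the word length along the sequence.
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        for i, word in enumerate(words):
            if hamming(window, word) <= max_mismatches:
                counts[i] += 1
    return counts
```

For example, compositional_spectrum("ACGTACGT", ["ACG", "TTT"], max_mismatches=2) counts two windows for "ACG" and four for "TTT". This brute-force scan costs time proportional to sequence length times the number of words, so it is only meant to illustrate the definition on short inputs.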