Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
3
pubmed:dateCreated
1997-2-4
pubmed:abstractText
The relative abundance and rarity of DNA words have been recognized in previous biological studies to have implications for the regulation, repair, and evolutionary mechanisms of a genome. In this paper, we review several different measures of abundance and rarity of DNA words, including z-scores, representation ratios, and cross-ratios, that have appeared in the recent literature, and examine the concordance among them using the human cytomegalovirus genome sequence. We then rank all words of length k = 2, ..., 5 of seven herpesvirus genomes according to their abundance, as measured by one of the z-scores based upon a stationary Markov model of order k-2. Using a simple metric on the ranks of 2-words of the seven herpesvirus sequences, we construct an evolutionary tree. Several 3-words are observed to be consistently over- or underrepresented in all seven herpesviruses. Furthermore, clusters of some of the most over- and underrepresented 4- and 5-words in the genomes are identified with functional sites such as the origins of replication and regulatory signals of individual viruses.
pubmed:commentsCorrections
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
1066-5277
pubmed:author
pubmed:issnType
Print
pubmed:volume
3
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
345-60
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
1996
pubmed:articleTitle
Over- and underrepresentation of short DNA words in herpesvirus genomes.
pubmed:affiliation
Division of Mathematics and Statistics, University of Texas at San Antonio 78249, USA. leung@minuet.utsa.edu
pubmed:publicationType
Journal Article, Research Support, U.S. Gov't, Non-P.H.S., Research Support, Non-U.S. Gov't
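The pubmed:abstractText value above mentions representation ratios and a stationary Markov model of order k-2, but the record does not give the paper's formulas. As a rough illustration only, the sketch below computes an observed/expected ratio for a DNA word, where the expected count uses the standard maximal-order Markov estimate N(w1..w(k-1)) * N(w2..wk) / N(w2..w(k-1)). The function names `count_words` and `representation_ratio` are hypothetical, and this estimator is an assumption, not necessarily the measure the authors used.

```python
from collections import Counter


def count_words(seq, k):
    """Count overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def representation_ratio(seq, word):
    """Observed / expected count of `word` in `seq`.

    Assumption: the expected count comes from a Markov model of
    order k-2 fitted to the same sequence. Returns None when the
    expected count is zero.
    """
    k = len(word)
    if k < 2:
        raise ValueError("word must have length >= 2")

    observed = count_words(seq, k)[word]

    if k == 2:
        # Order 0: letters are treated as independent.
        singles = count_words(seq, 1)
        expected = singles[word[0]] * singles[word[1]] / len(seq)
    else:
        # Order k-2: combine the two overlapping (k-1)-words and
        # divide by the count of their shared (k-2)-word core.
        km1 = count_words(seq, k - 1)
        km2 = count_words(seq, k - 2)
        core = km2[word[1:-1]]
        if core == 0:
            return None
        expected = km1[word[:-1]] * km1[word[1:]] / core

    return observed / expected if expected else None
```

A ratio above 1 suggests the word is overrepresented relative to this model and a ratio below 1 suggests it is underrepresented. Turning the ratio into a z-score would also need a variance estimate, which is not sketched here.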