Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
8
pubmed:dateCreated
2007-5-1
pubmed:abstractText
Complex genomes contain numerous repeated sequences, and genomic duplication is believed to be a main evolutionary mechanism to obtain new functions. Several tools are available for de novo repeat sequence identification, and many approaches exist for clustering homologous protein sequences. We present an efficient new approach to identify and cluster homologous DNA sequences with high accuracy at the level of whole genomes, excluding low-complexity repeats, tandem repeats and annotated interspersed repeats. We also determine the boundaries of each group member so that it closely represents a biological unit, e.g. a complete gene, or a partial gene coding a protein domain.
pubmed:grant
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:month
Apr
pubmed:issn
1367-4811
pubmed:author
pubmed:issnType
Electronic
pubmed:day
15
pubmed:volume
23
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
917-25
pubmed:dateRevised
2009-11-4
pubmed:meshHeading
pubmed:year
2007
pubmed:articleTitle
HomologMiner: looking for homologous genomic groups in whole genomes.
pubmed:affiliation
Department of Computer Science & Engineering, Penn State University, PA, USA. mhou@cse.psu.edu
pubmed:publicationType
Journal Article, Research Support, N.I.H., Extramural