Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
2005-7-19
pubmed:abstractText
There is an enormous amount of information encoded in each genome--enough to create living, responsive and adaptive organisms. Raw sequence data alone is not enough to understand function, mechanisms or interactions. Changes in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features are varied in their data types and annotation of these features is spread across multiple databases. Herein, we develop a method to automate exploration of genomes by iteratively exploring sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and was successful in detecting coinciding genomic features among genes, exons, repetitive elements and CpG islands.
pubmed:commentsCorrections
http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-10547828, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-10802651, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-10889045, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-11242984, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-11333864, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-11932033, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-11983053, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-12134150, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-12186644, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-12424129, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-12728276, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-12894190, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-14678565, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15090078, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15504238, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15588317, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15688072, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15705189, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-15723693, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-3656447, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-7505451, http://linkedlifedata.com/resource/pubmed/commentcorrection/16026599-8996792
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:month
Jul
pubmed:issn
1471-2105
pubmed:author
pubmed:issnType
Electronic
pubmed:day
15
pubmed:volume
6 Suppl 2
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
S2
pubmed:dateRevised
2009-11-18
pubmed:meshHeading
pubmed:year
2005
pubmed:articleTitle
Automating genomic data mining via a sequence-based matrix format and associative rule set.
pubmed:affiliation
Advanced Center for Genome Technology, Department of Botany and Microbiology, 101 David L. Boren Blvd, Rm 2025. Jonathan.Wren@OU.edu
pubmed:publicationType
Journal Article, Research Support, U.S. Gov't, Non-P.H.S.
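
The sequence matrix described in the pubmed:abstractText value can be pictured with a minimal sketch. This is not the authors' implementation: the row names, the toy coordinates, the helper functions build_row and overlap_fraction, and the use of a simple overlap fraction as the correlative measure are all assumptions made for illustration. The classification-tree step is left out; every row here is treated as the same boolean "covered or not" type.

```python
from typing import Dict, List, Tuple

# Half-open [start, end) coordinates; this convention is an assumption.
Interval = Tuple[int, int]


def build_row(length: int, intervals: List[Interval]) -> List[bool]:
    """Return one matrix row: True at every position covered by a feature."""
    row = [False] * length
    for start, end in intervals:
        # Clamp each interval to the sequence bounds before marking it.
        for pos in range(max(start, 0), min(end, length)):
            row[pos] = True
    return row


def overlap_fraction(row_a: List[bool], row_b: List[bool]) -> float:
    """Fraction of positions covered in row_a that are also covered in row_b."""
    covered_a = sum(row_a)
    if covered_a == 0:
        return 0.0
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    return both / covered_a


# Toy annotations on a 20-base sequence (invented values, not real data).
SEQUENCE_LENGTH = 20
annotations: Dict[str, List[Interval]] = {
    "gene": [(2, 16)],
    "exon": [(2, 6), (10, 16)],
    "cpg_island": [(0, 4)],
}

# One row per annotation source, all indexed by the same sequence positions.
sequence_matrix = {
    name: build_row(SEQUENCE_LENGTH, intervals)
    for name, intervals in annotations.items()
}

# Compare rows pairwise to find coinciding features.
exon_in_gene = overlap_fraction(sequence_matrix["exon"], sequence_matrix["gene"])
cpg_in_gene = overlap_fraction(sequence_matrix["cpg_island"], sequence_matrix["gene"])
```

Because every row shares the same position index, annotations from different sources can be compared column by column without any coordinate conversion, which is the property the abstract attributes to the position-dependent matrix.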