Source: http://linkedlifedata.com/resource/pubmed/id/11751218
Predicate | Object |
---|---|
rdf:type | |
lifeskim:mentions | |
pubmed:issue | 12 |
pubmed:dateCreated | 2001-12-25 |
pubmed:abstractText | The process of determining the functional sequence content of an organism is confounded by several factors. Large protein coding sequences are relatively easy to find by statistical methods. Smaller proteins, however, may escape detection because their size falls below some arbitrary researcher-defined minimum cutoff, or because a promoter or translational start cannot be precisely defined (Delcher et al., Nucleic Acids Res., 27, 4636-4641, 1999). Promoter and regulatory sequences themselves are difficult to define because of the significant amount of allowable sequence variation, and because there are probably no completely accurate whole-organism gene catalogs to date. Finally, certain genes coding for functional RNAs may have insufficient structural or sequence constraints to be detectable by normal sequence structure/pattern searching methods (Eddy and Rivas, Bioinformatics, 16, 583-605, 2000). Where multiple closely related organisms have been sequenced, additional information is available for investigating sequence content: the possible conservation of functional sequences between the organisms. We present a method that uses this conservation to detect genes and other potentially functional sequences that may be missed by standard ORF-calling, RNA-finding, and pattern-matching software. The tricross programs produce a multi-way cross comparison of three sets of sequences, determine which sequences are conserved in all three sets, and produce a graphical representation (Virtual Reality Modelling Language, VRML; ISO/IEC 14772-1:1997, VDC, 1997) as well as alignments of all sequence triples found. The software can also be applied to a pair of sequence sets, though the noise in the results increases. |
pubmed:grant | |
pubmed:language | eng |
pubmed:journal | |
pubmed:citationSubset | IM |
pubmed:chemical | |
pubmed:status | MEDLINE |
pubmed:month | Dec |
pubmed:issn | 1367-4803 |
pubmed:author | |
pubmed:issnType | Print |
pubmed:volume | 17 |
pubmed:owner | NLM |
pubmed:authorsComplete | Y |
pubmed:pagination | 1105-12 |
pubmed:dateRevised | 2007-11-14 |
pubmed:meshHeading | pubmed-meshheading:11751218-Base Sequence; pubmed-meshheading:11751218-DNA, Archaeal; pubmed-meshheading:11751218-DNA, Intergenic; pubmed-meshheading:11751218-Genome, Archaeal; pubmed-meshheading:11751218-Molecular Sequence Data; pubmed-meshheading:11751218-Pyrococcus; pubmed-meshheading:11751218-Software |
pubmed:year | 2001 |
pubmed:articleTitle | Tricross: using dot-plots in sequence-id space to detect uncataloged intergenic features. |
pubmed:affiliation | Children's Research Institute, The Ohio State University, 700 Childrens Dr., Columbus, OH 43205, USA. ray@biosci.ohio-state.edu |
pubmed:publicationType | Journal Article; Research Support, U.S. Gov't, P.H.S.; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't |
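
The abstract describes a cross comparison of three sequence sets that keeps only the sequences conserved in all three. The record does not say how tricross scores similarity, so the following is only a minimal sketch of the general idea, not the tricross implementation. It assumes a stand-in similarity test (two sequences "match" if they share at least one k-mer), and the names `kmers`, `similar`, and `conserved_triples`, as well as the toy sequences, are hypothetical.

```python
from itertools import product


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similar(s1, s2, k=8):
    """Stand-in similarity test (an assumption): the sequences share at least one k-mer."""
    return bool(kmers(s1, k) & kmers(s2, k))


def pairwise_hits(set_x, set_y, k):
    """All (id_x, id_y) pairs whose sequences pass the similarity test."""
    return {
        (x, y)
        for x, y in product(set_x, set_y)
        if similar(set_x[x], set_y[y], k)
    }


def conserved_triples(set_a, set_b, set_c, k=8):
    """Return sorted (id_a, id_b, id_c) triples that match pairwise across all three sets.

    Each triple is one point in a three-dimensional sequence-id space.
    """
    ab = pairwise_hits(set_a, set_b, k)
    bc = pairwise_hits(set_b, set_c, k)
    ac = pairwise_hits(set_a, set_c, k)
    return sorted(
        (a, b, c)
        for (a, b) in ab
        for c in set_c
        if (b, c) in bc and (a, c) in ac
    )


# Toy input: sequence id -> sequence, one dict per organism (made-up data).
set_a = {"A1": "ACGTACGTGGCCTTAA", "A2": "TTTTTTTTTTTT"}
set_b = {"B1": "GGACGTACGTGGAA", "B2": "CCCCCCCCCCCC"}
set_c = {"C1": "TTACGTACGTGGCC", "C2": "GAGAGAGAGAGA"}

triples = conserved_triples(set_a, set_b, set_c)
print(triples)
```

Requiring all three pairwise matches is what filters out chance hits between any two sets, which is consistent with the abstract's remark that a two-set comparison is noisier. A real analysis would replace `similar` with a proper alignment-based score.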