Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
1
pubmed:dateCreated
1990-05-31
pubmed:abstractText
Statistical methodology for the identification and characterization of protein binding sites in a set of unaligned DNA fragments is presented. Each sequence must contain at least one common site. No alignment of the sites is required. Instead, the uncertainty in the location of the sites is handled by employing the missing information principle to develop an "expectation maximization" (EM) algorithm. This approach allows for the simultaneous identification of the sites and characterization of the binding motifs. The reliability of the algorithm increases with the number of fragments, but the computations increase only linearly. The method is illustrated with an example, using known cyclic adenosine monophosphate receptor protein (CRP) binding sites. The final motif is utilized in a search for undiscovered CRP binding sites.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
0887-3585
pubmed:author
pubmed:issnType
Print
pubmed:volume
7
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
41-51
pubmed:dateRevised
2007-11-15
pubmed:meshHeading
pubmed:year
1990
pubmed:articleTitle
An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences.
pubmed:affiliation
Biometrics Laboratory, Wadsworth Center for Laboratories and Research, New York State Department of Health, Albany 12201.
pubmed:publicationType
Journal Article
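
The abstract in this record describes an expectation maximization (EM) approach in which the unknown site locations are treated as missing data. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. It rests on assumptions that are not stated in the record: exactly one site per sequence, a fixed motif width, a uniform prior over start positions, background base frequencies held fixed, and uppercase DNA input containing only A, C, G and T. The name em_motif and all of its parameters are hypothetical.

```python
import random

ALPHABET = "ACGT"  # assumption: uppercase DNA only


def em_motif(seqs, width, n_iter=50, pseudocount=0.5, seed=0):
    """Sketch of EM motif finding, assuming one site per sequence.

    Returns (pwm, post): pwm is a list of per-position base
    probabilities, post holds each sequence's posterior over site
    start positions from the final E-step.
    """
    rng = random.Random(seed)

    # Background base frequencies, estimated once and then held fixed
    # (a simplification made for this sketch).
    counts = {b: pseudocount for b in ALPHABET}
    for s in seqs:
        for ch in s:
            counts[ch] += 1
    total = sum(counts.values())
    bg = {b: counts[b] / total for b in ALPHABET}

    # Random starting motif; each position is normalised to sum to 1.
    pwm = []
    for _ in range(width):
        row = [rng.random() + 0.1 for _ in ALPHABET]
        z = sum(row)
        pwm.append({b: v / z for b, v in zip(ALPHABET, row)})

    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each start position, which
        # is proportional to the motif/background likelihood ratio of
        # the window beginning there.
        post = []
        for s in seqs:
            weights = []
            for start in range(len(s) - width + 1):
                ratio = 1.0
                for j in range(width):
                    ch = s[start + j]
                    ratio *= pwm[j][ch] / bg[ch]
                weights.append(ratio)
            z = sum(weights)
            post.append([w / z for w in weights])

        # M-step: re-estimate the motif from expected base counts,
        # weighting every window by its posterior probability.
        expected = [{b: pseudocount for b in ALPHABET} for _ in range(width)]
        for s, p in zip(seqs, post):
            for start, w in enumerate(p):
                for j in range(width):
                    expected[j][s[start + j]] += w
        pwm = [
            {b: row[b] / sum(row.values()) for b in ALPHABET}
            for row in expected
        ]

    return pwm, post
```

Calling em_motif(["ACGTTGTGAAC", "TTGTGACCGTA", "GGCATGTGATT"], width=5) returns a motif matrix with five positions and one posterior distribution per sequence. Each pass visits every candidate window of every sequence once, so the work per iteration grows linearly with the amount of sequence data. EM can converge to a local optimum, so in practice several random starts (different seed values) would likely be compared.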