Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
4
pubmed:dateCreated
1997-3-10
pubmed:abstractText
We present a method for condensing the information in multiple alignments of proteins into a mixture of Dirichlet densities over amino acid distributions. Dirichlet mixture densities are designed to be combined with observed amino acid frequencies to form estimates of expected amino acid probabilities at each position in a profile, hidden Markov model or other statistical model. These estimates give a statistical model greater generalization capacity, so that remotely related family members can be more reliably recognized by the model. This paper corrects the previously published formula for estimating these expected probabilities, and contains complete derivations of the Dirichlet mixture formulas, methods for optimizing the mixtures to match particular databases, and suggestions for efficient implementation.
pubmed:grant
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:month
Aug
pubmed:issn
0266-7061
pubmed:author
pubmed:issnType
Print
pubmed:volume
12
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
327-45
pubmed:dateRevised
2007-11-15
pubmed:meshHeading
pubmed:year
1996
pubmed:articleTitle
Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology.
pubmed:affiliation
Baskin Center for Computer Engineering and Information Sciences, University of California at Santa Cruz 95064, USA. kimmen@cse.ucsc.edu
pubmed:publicationType
Journal Article; Comparative Study; Research Support, U.S. Gov't, P.H.S.; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't
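
The abstract text above describes combining observed amino acid frequencies with a Dirichlet mixture to estimate expected probabilities at a position. The following is a minimal sketch of the standard posterior-mean estimate under a Dirichlet mixture prior, which is assumed here to correspond to the estimate the abstract refers to. The function names, the 3-letter toy alphabet, and all parameter values are hypothetical and chosen only for illustration; a real mixture would cover 20 amino acids and use parameters fitted to a database.

```python
import math


def log_beta(alpha):
    """Log of the multivariate Beta function B(alpha)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def posterior_mean(counts, mix_coeffs, components):
    """Estimate letter probabilities from counts under a Dirichlet mixture.

    counts      observed counts n_i for each letter at one position
    mix_coeffs  mixture coefficients q_j (assumed positive, summing to 1)
    components  Dirichlet parameter vectors alpha_j, one per component
    """
    # Log posterior weight of each component given the counts, up to a
    # constant shared by all components: log q_j + log B(n + a_j) - log B(a_j)
    log_w = []
    for q, alpha in zip(mix_coeffs, components):
        post = [n + a for n, a in zip(counts, alpha)]
        log_w.append(math.log(q) + log_beta(post) - log_beta(alpha))

    # Normalize in a numerically stable way (subtract the maximum first).
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total_w = sum(w)
    w = [x / total_w for x in w]

    # Weighted average of each component's pseudocount-style estimate
    # (n_i + alpha_ji) / (|n| + |alpha_j|).
    n_total = sum(counts)
    probs = [0.0] * len(counts)
    for wj, alpha in zip(w, components):
        denom = n_total + sum(alpha)
        for i, (n, a) in enumerate(zip(counts, alpha)):
            probs[i] += wj * (n + a) / denom
    return probs


if __name__ == "__main__":
    # Hypothetical toy data: 3 letters, 2 mixture components.
    counts = [3, 0, 1]
    mix_coeffs = [0.5, 0.5]
    components = [[1.0, 1.0, 1.0], [5.0, 0.5, 0.5]]
    print(posterior_mean(counts, mix_coeffs, components))
```

Because every component contributes positive pseudocounts, a letter with zero observed count still receives a nonzero probability, which is the generalization effect the abstract mentions for recognizing remotely related family members.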