Statements in which the resource exists.
Subject | Predicate | Object | Context
pubmed-article:9109037 | rdf:type | pubmed:Citation | lld:pubmed
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C1514562 | lld:lifeskim
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C0439064 | lld:lifeskim
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C0449820 | lld:lifeskim
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C0037775 | lld:lifeskim
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C0521115 | lld:lifeskim
pubmed-article:9109037 | lifeskim:mentions | umls-concept:C1708943 | lld:lifeskim
pubmed-article:9109037 | pubmed:issue | 1 | lld:pubmed
pubmed-article:9109037 | pubmed:dateCreated | 1997-6-23 | lld:pubmed
pubmed-article:9109037 | pubmed:abstractText | Several computer algorithms now exist for discovering multiple motifs (expressed as weight matrices) that characterize a family of protein sequences known to be homologous. This paper describes a method for performing similarity searches of protein sequence databases using such a group of motifs. By simultaneously using all the motifs that characterize a protein family, the sensitivity and specificity of the database search are increased. We define the p-value for a target sequence to be the probability of a random sequence of the same length scoring as well or better in comparison to all the motifs that characterize the family. (The p-value of a database search can be determined from this value and the size of the database.) We show that estimating the distribution of single motif scores by a Gaussian extreme value distribution is insufficiently accurate to provide a useful estimate of the p-value, but that this deficiency can be corrected by reestimating the parameters of the underlying Gaussian distribution from observed scores for comparison of a given motif and sequence database. These parameters are used to calculate a "reduced variate" which has a Gumbel limiting distribution. Multiple motif scores are combined to give a single p-value by using the sum of the reduced variates for the motif scores as the test statistic. We give a computationally efficient approximation to the distribution of the sum of independent Gumbel random variables and verify experimentally that it closely approximates the distribution of the test statistic. Experiments on pseudorandom sequences show that the approximated p-values are conservative, so the significance of high scores in database searches will not be overstated. Experiments with real protein sequences and motifs identified by the MEME algorithm show that determining an overall p-value based on the combination of multiple motifs gives significantly better database search results than using p-values of single motifs. | lld:pubmed
pubmed-article:9109037 | pubmed:grant | http://linkedlifedata.com/r... | lld:pubmed
pubmed-article:9109037 | pubmed:language | eng | lld:pubmed
pubmed-article:9109037 | pubmed:journal | http://linkedlifedata.com/r... | lld:pubmed
pubmed-article:9109037 | pubmed:citationSubset | IM | lld:pubmed
pubmed-article:9109037 | pubmed:chemical | http://linkedlifedata.com/r... | lld:pubmed
pubmed-article:9109037 | pubmed:status | MEDLINE | lld:pubmed
pubmed-article:9109037 | pubmed:issn | 1066-5277 | lld:pubmed
pubmed-article:9109037 | pubmed:author | pubmed-author:BaileyT LTL | lld:pubmed
pubmed-article:9109037 | pubmed:author | pubmed-author:GribskovMM | lld:pubmed
pubmed-article:9109037 | pubmed:issnType | Print | lld:pubmed
pubmed-article:9109037 | pubmed:volume | 4 | lld:pubmed
pubmed-article:9109037 | pubmed:owner | NLM | lld:pubmed
pubmed-article:9109037 | pubmed:authorsComplete | Y | lld:pubmed
pubmed-article:9109037 | pubmed:pagination | 45-59 | lld:pubmed
pubmed-article:9109037 | pubmed:dateRevised | 2007-11-14 | lld:pubmed
pubmed-article:9109037 | pubmed:meshHeading | pubmed-meshheading:9109037-... | lld:pubmed
pubmed-article:9109037 | pubmed:meshHeading | pubmed-meshheading:9109037-... | lld:pubmed
pubmed-article:9109037 | pubmed:year | 1997 | lld:pubmed
pubmed-article:9109037 | pubmed:articleTitle | Score distributions for simultaneous matching to multiple motifs. | lld:pubmed
pubmed-article:9109037 | pubmed:affiliation | San Diego Supercomputer Center, California 92186-9784, USA. tbailey@sdsc.edu | lld:pubmed
pubmed-article:9109037 | pubmed:publicationType | Journal Article | lld:pubmed
pubmed-article:9109037 | pubmed:publicationType | Research Support, U.S. Gov't, P.H.S. | lld:pubmed
pubmed-article:9109037 | pubmed:publicationType | Research Support, U.S. Gov't, Non-P.H.S. | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
http://linkedlifedata.com/r... | pubmed:referesTo | pubmed-article:9109037 | lld:pubmed
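
The abstract recorded under pubmed:abstractText notes that the p-value of a database search can be determined from a target sequence's p-value and the size of the database. The record does not give the formula. The sketch below shows one common way to do this, assuming the database sequences behave as independent trials, so that the search p-value is 1 - (1 - p)^N. The function name and parameters are hypothetical and are not taken from the paper.

```python
import math


def database_search_p_value(sequence_p_value: float, database_size: int) -> float:
    """Probability that at least one of `database_size` random sequences
    scores as well as or better than the observed one.

    Assumption (not stated in the record): sequences are independent trials,
    giving 1 - (1 - p)^N.
    """
    if not 0.0 <= sequence_p_value <= 1.0:
        raise ValueError("sequence_p_value must lie in [0, 1]")
    if database_size < 0:
        raise ValueError("database_size must be non-negative")
    if sequence_p_value == 1.0:
        # log1p(-1) is undefined; handle the boundary case directly.
        return 1.0 if database_size > 0 else 0.0
    # -expm1(N * log1p(-p)) equals 1 - (1 - p)^N but stays accurate for tiny p.
    return -math.expm1(database_size * math.log1p(-sequence_p_value))
```

For very small p the result is close to N * p, the expected number of random sequences scoring at least as well.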