Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
2006-2-1
pubmed:abstractText
High throughput protein interaction data sets have proven to be notoriously noisy. Although it is possible to focus on interactions with higher reliability by using only those that are backed up by two or more lines of evidence, this approach invariably throws out the majority of available data. A more optimal use could be achieved by incorporating the probabilities associated with all available interactions into the analysis. We present a novel method for estimating error rates associated with specific protein interaction data sets, as well as with individual interactions given the data sets in which they appear. As a bonus, we also get an estimate for the total number of protein interactions in yeast. Certain types of false positive results can be identified and removed, resulting in a significant improvement in quality of the data set. For co-purification data sets, we show how we can reach a tradeoff between the "spoke" and "matrix" representation of interactions within co-purified groups of proteins to achieve an optimal false positive error rate.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
1551-7497
pubmed:author
pubmed:issnType
Print
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
216-23
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
2004
pubmed:articleTitle
Estimating and improving protein interaction error rates.
pubmed:affiliation
Lipper Center for Computational Genetics, Harvard Medical School, USA. patrik@genetics.med.harvard.edu
pubmed:publicationType
Journal Article, Comparative Study, Research Support, U.S. Gov't, Non-P.H.S.