Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
8
pubmed:dateCreated
2010-2-25
pubmed:abstractText
Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag: an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bags-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
pubmed:commentsCorrections
http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-10339815, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-11812155, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-12381322, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-12506205, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-12520050, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-14636603, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-14962928, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-14985506, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15299650, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15335781, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15572779, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15701525, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15963890, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-15980462, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-16381864, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-16678402, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-16762072, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-17237095, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-17335583, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-17570145, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-17826686, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-17918729, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-18004789, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-18436442, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-19269161, 
http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-2769748, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-7723011, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-8377180, http://linkedlifedata.com/resource/pubmed/commentcorrection/20133727-9796821
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:month
Feb
pubmed:issn
1091-6490
pubmed:author
pubmed:issnType
Electronic
pubmed:day
23
pubmed:volume
107
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
3481-6
pubmed:dateRevised
2010-9-27
pubmed:meshHeading
pubmed:year
2010
pubmed:articleTitle
FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately.
pubmed:affiliation
Department of Computer Science, University of Haifa, Mount Carmel, Haifa 31905, Israel.
pubmed:publicationType
Journal Article, Research Support, Non-U.S. Gov't
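
The bag-of-fragments idea described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the fragment library is supplied by the caller, each backbone segment is matched to a library fragment by comparing internal pairwise distances (a simple stand-in for a superposition-based measure, which the abstract does not specify), and cosine similarity is assumed as the vector comparison.

```python
import math
from collections import Counter
from itertools import combinations


def distance_profile(points):
    """Pairwise distances between points; unchanged by rotation or translation."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def nearest_fragment(segment, library_profiles):
    """Index of the library fragment whose distance profile is closest (assumed metric)."""
    profile = distance_profile(segment)

    def cost(i):
        return sum((a - b) ** 2 for a, b in zip(profile, library_profiles[i]))

    return min(range(len(library_profiles)), key=cost)


def bag_of_fragments(backbone, library, frag_len):
    """Count how often each library fragment best matches an overlapping backbone segment."""
    library_profiles = [distance_profile(f) for f in library]
    counts = Counter(
        nearest_fragment(backbone[i:i + frag_len], library_profiles)
        for i in range(len(backbone) - frag_len + 1)
    )
    return [counts.get(i, 0) for i in range(len(library))]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Two structures are then compared by calling `bag_of_fragments` on each backbone (a list of 3D coordinates) with the same library and fragment length, and passing the two resulting vectors to `cosine_similarity`. Because each structure reduces to a fixed-length count vector, the comparison cost does not depend on structure length once the vectors are built.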