21030734

Source: http://linkedlifedata.com/resource/pubmed/id/21030734

Statements in which the resource exists as a subject.
Predicate	Object
rdf:type	pubmed:Citation
lifeskim:mentions	umls-concept:C0012751, umls-concept:C0205460, umls-concept:C0242356, umls-concept:C0243095, umls-concept:C1880157, umls-concept:C1883037
pubmed:issue	4
pubmed:dateCreated	2010-10-29
pubmed:abstractText	Modern biological applications usually involve the similarity comparison between two objects, which is often computationally very expensive, such as whole genome pairwise alignment and protein 3D structure alignment. Nevertheless, being able to quickly identify the closest neighboring objects from very large databases for a newly obtained sequence or structure can provide timely hints to its functions and more. This paper presents a substantial speedup technique for the well-studied k-nearest neighbor (k-nn) search, based on novel concepts of virtual pivots and partial pivots, such that a significant number of the expensive distance computations can be avoided. The new method is able to dynamically locate virtual pivots, according to the query, with increasing pruning ability. Using the same or less amount of database preprocessing effort, the new method outperformed the second best method by using no more than 40 percent distance computations per query, on a database of 10,000 gene sequences, compared to several best known k-nn search methods including M-Tree, OMNI, SA-Tree, and LAESA. We demonstrated the use of this method on two biological sequence data sets, one of which is for HIV-1 viral strain computational genotyping.
pubmed:language	eng
pubmed:journal	http://linkedlifedata.com/resource/pubmed/journal/101196755
pubmed:citationSubset	IM
pubmed:chemical	http://linkedlifedata.com/resource/pubmed/chemical/Proteins
pubmed:status	MEDLINE
pubmed:issn	1557-9964
pubmed:author	pubmed-author:CaiZhipengZ, pubmed-author:LinGuohuiG, pubmed-author:SanderJörgJ, pubmed-author:WangLushengL, pubmed-author:ZhouJianjunJ
pubmed:issnType	Electronic
pubmed:volume	7
pubmed:owner	NLM
pubmed:authorsComplete	Y
pubmed:pagination	669-80
pubmed:meshHeading	pubmed-meshheading:21030734-Computational Biology, pubmed-meshheading:21030734-Databases, Genetic, pubmed-meshheading:21030734-Databases, Protein, pubmed-meshheading:21030734-Proteins, pubmed-meshheading:21030734-Sequence Analysis, Protein
pubmed:articleTitle	Finding the nearest neighbors in biological databases using less distance computations.
pubmed:affiliation	Department of Computing Science, University of Alberta, Edmonton, Canada. jianjun@cs.ualberta.ca
pubmed:publicationType	Journal Article, Research Support, Non-U.S. Gov't
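
The abstract above describes avoiding expensive distance computations during k-nn search by using pivots. As a rough illustration only, the sketch below shows the classic fixed-pivot idea (in the spirit of LAESA, one of the baselines the abstract names), not the paper's virtual or partial pivots. For a metric d, the triangle inequality gives |d(q, p) - d(o, p)| <= d(q, o), so precomputed object-to-pivot distances yield a cheap lower bound that can rule out objects without computing d(q, o). The function names, the Hamming distance stand-in, and the data layout are all assumptions made for this example.

```python
import heapq
from typing import Callable, List, Sequence, Tuple


def hamming(a: str, b: str) -> int:
    """Toy metric standing in for an expensive distance (equal-length strings)."""
    return sum(x != y for x, y in zip(a, b))


def build_pivot_table(
    db: Sequence[str], pivots: Sequence[str], dist: Callable[[str, str], int]
) -> List[List[int]]:
    """Preprocessing: store the distance from every database object to every pivot."""
    return [[dist(obj, p) for p in pivots] for obj in db]


def knn_with_pivots(
    query: str,
    db: Sequence[str],
    pivots: Sequence[str],
    table: Sequence[Sequence[int]],
    dist: Callable[[str, str], int],
    k: int,
) -> Tuple[List[Tuple[int, int]], int]:
    """Return (sorted list of (distance, db index) for the k nearest, distance calls used)."""
    calls = 0
    q_to_p = []
    for p in pivots:
        q_to_p.append(dist(query, p))
        calls += 1

    # Lower bound on d(query, obj) from the triangle inequality; 0 if no pivots.
    bounds = [
        max((abs(qp - op) for qp, op in zip(q_to_p, row)), default=0)
        for row in table
    ]

    # Visit objects in order of increasing lower bound.
    order = sorted(range(len(db)), key=bounds.__getitem__)

    heap: List[Tuple[int, int]] = []  # max-heap via negated distances: (-d, index)
    for i in order:
        # Once k candidates are held and the bound reaches the current k-th
        # distance, no remaining object can improve the result.
        if len(heap) == k and bounds[i] >= -heap[0][0]:
            break
        d = dist(query, db[i])
        calls += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))

    return sorted((-nd, i) for nd, i in heap), calls
```

Sorting by lower bound lets the loop stop at the first object whose bound is no smaller than the current k-th distance. How many distance calls this saves depends heavily on the data and on the choice of pivots, which is the aspect the cited paper addresses by selecting pivots per query.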