Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:dateCreated
1990-04-26
pubmed:abstractText
The method presented here is intended as a compromise between finding a good overall alignment and the time taken to do so. Many multiple alignment algorithms spend an excessively large amount of effort trying to find the best global alignment. This time is often ill spent because the results of the standard dynamic programming alignment algorithm are dominated by the choice of gap penalty and the form of the score matrix, both of which have a poor theoretical foundation. Nonetheless, it is important that savings in time do not compromise the quality of the alignment. By using the consensus sequence approach, this danger is largely avoided as the conserved features of the sequences are quickly identified and preserved through further cycles. In the alignment of existing alignments, which is one of the more novel aspects of the method, each alignment was treated as an averaged consensus sequence with gaps making no contribution. This gives rise to the advantageous property that gaps will have a greater propensity to be inserted where there are already gaps and is equivalent to a local change in the gap penalty. This type of behavior represents a transition away from the homogeneous scoring schemes used in aligning two sequences toward a scoring scheme that depends on position in the sequence. The alignment of consensus sequences thus forms a bridge between simple pair alignment and the alignment of discrete patterns in which sequence features and allowed gap locations are exaggerated. To complete this transition the program described above has been integrated into the earlier pattern matching (template) program. Such templates can reliably locate sequence similarities that are too weak or scattered to be found by the more standard alignment methods and should therefore produce a further condensation of the sequence data bank. 
Only by continually extending our knowledge of the relationships between sequences to increasingly distant similarities can we hope to avoid being overwhelmed by the increasing amount of data.
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:chemical
pubmed:status
MEDLINE
pubmed:issn
0076-6879
pubmed:author
pubmed:issnType
Print
pubmed:volume
183
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
456-74
pubmed:dateRevised
2006-11-15
pubmed:meshHeading
pubmed:year
1990
pubmed:articleTitle
Hierarchical method to align large numbers of biological sequences.
pubmed:publicationType
Journal Article, Comparative Study
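
The abstract's central idea, aligning two existing alignments by treating each as an averaged consensus sequence in which gaps make no contribution, can be illustrated with a minimal sketch. This is not the program described in the article. The function names, the match/mismatch scores, and the single linear gap penalty are all assumptions chosen for brevity; the record does not specify the score matrix or gap penalty actually used.

```python
from collections import Counter

GAP = "-"


def column_profiles(alignment):
    """Convert an alignment (equal-length strings) to per-column residue frequencies.

    Gaps are counted in the denominator but get no entry of their own,
    so a gappy column carries less total weight when it is scored.
    """
    n_rows = len(alignment)
    profiles = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != GAP)
        profiles.append({res: c / n_rows for res, c in counts.items()})
    return profiles


def column_score(p, q, match=1.0, mismatch=-1.0):
    """Average pairwise score between two profile columns (assumed toy scoring)."""
    return sum(fp * fq * (match if a == b else mismatch)
               for a, fp in p.items() for b, fq in q.items())


def align_alignments(aln_a, aln_b, gap_penalty=-1.0):
    """Globally align two alignments column by column; return (score, merged rows)."""
    pa, pb = column_profiles(aln_a), column_profiles(aln_b)
    n, m = len(pa), len(pb)

    # Standard dynamic programming table over columns, with a linear gap penalty.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(pa[i - 1], pb[j - 1]),
                score[i - 1][j] + gap_penalty,
                score[i][j - 1] + gap_penalty,
            )

    # Trace back, emitting merged columns (rows of A followed by rows of B).
    cols_a, cols_b = list(zip(*aln_a)), list(zip(*aln_b))
    gap_a, gap_b = (GAP,) * len(aln_a), (GAP,) * len(aln_b)
    merged = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + column_score(pa[i - 1], pb[j - 1])):
            merged.append(cols_a[i - 1] + cols_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap_penalty:
            merged.append(cols_a[i - 1] + gap_b)
            i -= 1
        else:
            merged.append(gap_a + cols_b[j - 1])
            j -= 1
    merged.reverse()
    rows = ["".join(row) for row in zip(*merged)]
    return score[n][m], rows
```

For example, `align_alignments(["ACGT", "ACGT"], ["AGT"])` returns a score of `2.0` and the merged rows `["ACGT", "ACGT", "A-GT"]`. Because a column that already contains gaps has frequencies summing to less than one, matching it earns less, which makes inserting a further gap there relatively cheaper. That is the position-dependent effect the abstract describes as equivalent to a local change in the gap penalty.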