pubmed:abstractText |
A statistical analysis of a set of genomic human Alu elements is based on a published alignment and a recent classification of these sequences. After separation of the Alu sequences into families, the consensus sequences of these families are determined, correctly weighting the unidirectional decay of CG dinucleotides. The tenfold greater mutation rate at CG dinucleotides requires that they be treated as an independent clock at every stage of the analysis. The distributions of the substitutions with respect to the new consensus sequences, taking the CG and the non-CG nucleotide positions separately, lie far closer to the expected distributions than does the total diversity. Computer analysis of the folding of RNAs derived from these sequences indicates that RNA secondary structure is conserved among Alu families, suggesting its importance for Alu proliferation and/or function. The folding pattern, further substantiated by a number of compensatory mutations, includes secondary structure domains that are homologous to those observed in 7SL RNA, as well as a defined region of interaction between the two Alu subunits. These results are consistent with a model in which a small number of conserved Alu master genes give rise, via retroposition, to the numerous copies of Alu pseudogenes, which then diversify by random substitution. The master genes appeared at different periods during evolution, giving rise to different families of Alu sequences.
|