Are there any reference alignments for human Alu families or subfamilies that someone has pre-computed?
See ftp://ftp.ncbi.nih.gov/repository/repbase/ALU-ALN/:
Each "aln." and "sqz." file contains the multiple alignment of one Alu subfamily, as denoted by the file's extension. The sequences are in Stanford/IG format. The first sequence in each file is the consensus sequence for that subfamily. Each sequence is aligned to the subfamily consensus using the Smith-Waterman algorithm, and the multiple alignment is constructed from these pairwise alignments. In the "aln." files, all insertions relative to the consensus sequence are included. By comparison, the "sqz." files contain "squeezed" sequences: insertions relative to the consensus sequence are removed from each aligned Alu sequence.
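To make the "aln." versus "sqz." distinction concrete, here is a minimal sketch of what "squeezing" means. It assumes the alignment is already loaded as equal-length strings with the consensus as the first row and `-` as the gap character. Both the gap symbol and the function name `squeeze_alignment` are my assumptions for illustration; this does not parse Stanford/IG files and is not the code used to build the Repbase files.

```python
def squeeze_alignment(rows, gap="-"):
    """Drop every column where the consensus (rows[0]) has a gap.

    Those columns are exactly the insertions relative to the consensus,
    so the result has one column per consensus position.
    The gap character is an assumption, not taken from the Repbase files.
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")
    # Column indices where the consensus has a real base.
    keep = [i for i, base in enumerate(rows[0]) if base != gap]
    return ["".join(row[i] for i in keep) for row in rows]


# Toy example (made-up sequences, not real Alu data).
aln = [
    "GGCC-GGGCG",  # consensus; the gap marks an insertion in a member
    "GGCCTGGGCG",  # member with an inserted T
    "GGCC-GG-CG",  # member with a deletion relative to the consensus
]
sqz = squeeze_alignment(aln)
for row in sqz:
    print(row)
```

Note that deletions relative to the consensus survive squeezing as gaps; only the inserted columns disappear, so every squeezed row has the same length as the ungapped consensus.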
You want reference sequences for Alu families? Did you already look at Repbase? Unfortunately, registration is required. There you can extract species-specific repeats (optionally including common ancestors).
I personally prefer the RepeatMasker edition. It has IDs (and sequences, which I don't post here) like: AluJb, AluJb_short_, AluJo...