Hello,
I am totally new to TE analysis (and bioinformatics in general). I ran RepeatMasker on three closely related species, and now I would like to find all the repeated elements common to all three. My .out files look like this:
   SW  perc  perc  perc  query       position in query            matching           repeat         position in repeat
score  div.  del.  ins.  sequence    begin    end      (left)     repeat             class/family   begin   end  (left)  ID
11220   0.0   0.0   0.0  ptg000001l      1   9492  (18167109)  +  (CTAAC)n           Simple_repeat      1  9491     (0)   1
 1524  25.8   1.9   2.1  ptg000001l   9493  10027  (18166574)  +  rnd-1_family-70    Unknown          110   643     (0)   2
 6766   6.2   0.3   3.2  ptg000001l  10032  10922  (18165679)  +  rnd-5_family-818   Unknown            1   866     (0)   3
 5127   5.1   0.0   0.0  ptg000001l  10924  11546  (18165055)  +  rnd-1_family-464   Unknown            1   623     (0)   4
 2991  13.3   3.2   5.9  ptg000001l  11547  12175  (18164426)  +  rnd-6_family-2133  LINE/R1            1   613  (2635)   5
I was planning to use bedtools to intersect columns 9 and 10 across the three species, but I do not know whether that is the correct way to do it, or whether other tools would be more convenient.
Thank you very much!
The solution you propose requires that the genomes have exact (or very similar) coordinates for the annotations. That requirement may or may not hold; it all depends on what "closely related" means in your context.
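If the goal is only to learn which repeats were annotated in all three species, regardless of where they sit, one coordinate-free option is to compare the repeat names themselves. Below is a minimal Python sketch of that idea, not a definitive method. It assumes all three genomes were masked with the same repeat library: names such as rnd-1_family-70 look like de novo family names, and those are only comparable across species if one shared library was used for every run. The function names and the tiny inline inputs are made up for illustration. The field positions follow the data rows in the question, where the strand is the 9th whitespace-separated field, the repeat name the 10th, and the class/family the 11th.

```python
import io


def repeat_keys(handle):
    """Collect (repeat name, class/family) pairs from a RepeatMasker .out file."""
    keys = set()
    for line in handle:
        fields = line.split()
        # Skip blank lines and the two header lines: a data row starts with
        # the integer SW score and has at least 11 fields.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        # 0-based indices 9 and 10 are the repeat name and class/family.
        keys.add((fields[9], fields[10]))
    return keys


def shared_repeats(handles):
    """Return the (name, class/family) pairs present in every input."""
    return set.intersection(*(repeat_keys(h) for h in handles))


# Hypothetical stand-ins for three species' .out files. Species A reuses two
# rows from the question; the rows for B and C are invented for illustration.
HEADER = (
    "   SW  perc perc perc query position in query matching repeat position in repeat\n"
    "score  div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID\n"
    "\n"
)
species_a = io.StringIO(
    HEADER
    + "11220 0.0 0.0 0.0 ptg000001l 1 9492 (18167109) + (CTAAC)n Simple_repeat 1 9491 (0) 1\n"
    + "2991 13.3 3.2 5.9 ptg000001l 11547 12175 (18164426) + rnd-6_family-2133 LINE/R1 1 613 (2635) 5\n"
)
species_b = io.StringIO(
    HEADER
    + "900 1.0 0.0 0.0 ctgB 10 200 (5000) + (CTAAC)n Simple_repeat 1 190 (0) 1\n"
    + "1500 10.0 1.0 1.0 ctgB 500 1100 (4100) C rnd-6_family-2133 LINE/R1 (2635) 613 1 2\n"
)
species_c = io.StringIO(
    HEADER
    + "700 2.0 0.0 0.0 ctgC 5 150 (800) + (CTAAC)n Simple_repeat 1 145 (0) 1\n"
    + "400 20.0 2.0 2.0 ctgC 300 700 (250) + rnd-2_family-9 Unknown 1 400 (0) 2\n"
)

common = shared_repeats([species_a, species_b, species_c])
for name, family in sorted(common):
    print(name, family, sep="\t")
```

With real data you would pass open file handles instead of the inline strings. If each species was masked with its own de novo library, the names will not match between species, and you would first need to build one combined library (or cluster the consensus sequences) before a comparison like this makes sense.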