Compare repeat elements among species
7.3 years ago
qwzhang0601 ▴ 80

Dear all:

I plan to do a comparative analysis of repeat elements among several rodents. For some rodents (e.g., mouse, rat), annotations are directly available from the RepeatMasker website (e.g., for mouse: http://www.repeatmasker.org/species/musMus.html). For the other rodents no such annotation is available, so I have to run RepeatMasker myself. I have some concerns about how to do the comparison and have two options in mind. Can anyone give me some suggestions? Thanks.

Choice 1: To make sure all the annotations are produced under the same conditions (e.g., the same repeat library and parameters), I annotate the repeats in every species myself and then do the comparison.

Choice 2: For mouse and rat I use the annotation files from the RepeatMasker website, because those annotations should be the standard. For the other rodents I annotate them as well as possible (for example, by also predicting species-specific repeat elements), and then compare.

Which one do you think makes more sense?

By the way, is there a package (or some code) that can help analyze the annotation file (see the example below) and report the percentage of the genome occupied by each kind of repeat element?

more mm10.fa.out

   SW   perc perc perc  query     position in query              matching       repeat         position in repeat
score   div. del. ins.  sequence  begin    end      (left)       repeat         class/family   begin   end   (left)  ID

14737    8.1  1.0  0.2  chr1      3000001  3000097  (192471874)  C L1MdFanc_I   LINE/L1        (2987)  3586  3489    1
   27    0.0  0.0  0.0  chr1      3000098  3000123  (192471848)  + (T)n         Simple_repeat  1       26    (0)     2
14737    8.1  1.0  0.2  chr1      3000124  3002128  (192469843)  C L1MdFanc_I   LINE/L1        (3085)  3488  1467    1

repeatmasker repeats • 1.5k views

Hi! The example file is hard for me to read. Maybe you can format it somehow? Regarding your choices, I suggest a mix. Using the same approach for everything is advisable; otherwise you will never know whether the differences are real or due to different analysis options. However, assuming that the available annotation can be considered a gold standard, and that your data should in principle allow you to obtain an annotation of similar quality, you can run the masking yourself on all organisms and use the available mouse and rat annotations to find the optimal masking parameters.
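
On the question about getting the percentage of each kind of repeat element from the .out file: a few lines of Python may be enough. The following is only a minimal sketch, not an established package. The function names (`parse_out_lines`, `class_percentages`) and the `genome_size` parameter are made up for illustration. It assumes the whitespace-separated column layout shown in the question, and that `end + (left)` gives the length of each query sequence. If `genome_size` is not supplied, it is estimated from sequences that have at least one hit, so sequences without any annotated repeat would be missed. Overlapping annotations are not merged, so totals may be slightly overestimated.

```python
from collections import defaultdict

# The three rows from the question, with the two header lines (illustrative sample).
SAMPLE = """\
   SW  perc perc perc  query  position in query  matching  repeat  position in repeat
score  div. del. ins.  sequence  begin  end  (left)  repeat  class/family  begin  end  (left)  ID

14737   8.1  1.0  0.2  chr1  3000001 3000097 (192471874) C  L1MdFanc_I  LINE/L1  (2987) 3586 3489  1
   27   0.0  0.0  0.0  chr1  3000098 3000123 (192471848) +  (T)n  Simple_repeat  1  26  (0)  2
14737   8.1  1.0  0.2  chr1  3000124 3002128 (192469843) C  L1MdFanc_I  LINE/L1  (3085) 3488 1467  1
"""


def parse_out_lines(lines):
    """Yield (query, begin, end, left, repeat_class) for each annotation row."""
    for line in lines:
        fields = line.split()
        # Data rows have at least 15 columns and start with an integer score;
        # this skips the two header lines and any blank lines.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        query = fields[4]
        begin, end = int(fields[5]), int(fields[6])
        left = int(fields[7].strip("()"))  # bases remaining after the hit
        yield query, begin, end, left, fields[10]


def class_percentages(lines, genome_size=None):
    """Return {class/family: percent of genome covered}.

    genome_size is a hypothetical parameter: pass the real assembly size if
    known; otherwise it is estimated as the sum of (end + left) per sequence.
    """
    bases = defaultdict(int)
    seq_len = {}
    for query, begin, end, left, rep_class in parse_out_lines(lines):
        bases[rep_class] += end - begin + 1  # coordinates are 1-based, inclusive
        seq_len[query] = end + left
    if genome_size is None:
        genome_size = sum(seq_len.values())
    return {c: 100.0 * n / genome_size for c, n in bases.items()}


for rep_class, pct in sorted(class_percentages(SAMPLE.splitlines()).items()):
    print(f"{rep_class}\t{pct:.6f}%")
```

For a real file you could pass the open file handle instead of the sample, e.g. `class_percentages(open("mm10.fa.out"))`. Running the same script on every species would also keep the summary step identical across the comparison. To group by class only (LINE, SINE, ...), split the `class/family` string on `/` before counting.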
