Question

RepeatMasker overlap and interpretation

4.7 years ago

Picasa ▴ 650

Dear all,

I have run RepeatMasker and I have this kind of result:

*out file

   SW  perc perc perc  query     position in query    matching          repeat        position in repeat
score  div. del. ins.  sequence  begin   end (left)   repeat            class/family   begin    end (left)  ID

  428   7.3 22.8  0.0  ctg1371     230   365 (1868) C rnd-1_family-52   DNA/Maverick  (6794)    181     15   1
  381  14.8 19.9  1.7  ctg1371     232   382 (1851) C rnd-1_family-50   Unknown        (938)    178      1   2 *

I don't understand why I get two different repeat classifications with a large overlap between them.

Is there any filtering to do? I mean, is it possible that one is less reliable than the other, and if so, based on what?

Thanks for your answers.

repeat overlap • 1.6k views

ADD COMMENT • link updated 4.7 years ago by lieven.sterck 15k • written 4.7 years ago by Picasa ▴ 650

score 0 · Answer 1 · 2020-03-20

4.7 years ago

lieven.sterck 15k

From what I can see in that output, there does not seem to be a large overlap (~180 bases, no? ~~out of 1900~~).

Also, the classification of repeats by RM is not very strict: of the 1900 bases the majority can be quite different, which keeps those two classes from being catalogued as one family. On the other hand, many repeat classes share a substantial part of their content (e.g. integrases, RNA polymerases, ...), so it is not surprising that they share some similarity with each other.
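To check the overlap directly, one can intersect the two query intervals from the "position in query" columns. Below is a minimal sketch in Python with the coordinates copied from the two hits in the question; the function name `query_overlap` is made up for illustration.

```python
def query_overlap(hit_a, hit_b):
    """Number of query bases shared by two (begin, end) intervals.

    Coordinates are 1-based and inclusive, as in the RepeatMasker .out file.
    """
    begin = max(hit_a[0], hit_b[0])
    end = min(hit_a[1], hit_b[1])
    return max(0, end - begin + 1)


def span(hit):
    """Length of a 1-based inclusive (begin, end) interval."""
    return hit[1] - hit[0] + 1


# Query coordinates on ctg1371, taken from the two lines in the question.
maverick = (230, 365)  # rnd-1_family-52, DNA/Maverick
unknown = (232, 382)   # rnd-1_family-50, Unknown

shared = query_overlap(maverick, unknown)
pct_of_maverick = round(100 * shared / span(maverick), 1)
pct_of_unknown = round(100 * shared / span(unknown), 1)

print(shared)           # 134
print(pct_of_maverick)  # 98.5
print(pct_of_unknown)   # 88.7
```

On these coordinates the shared stretch is 134 bases. That is most of each hit on the contig, even though it is short compared with either consensus sequence.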

ADD COMMENT • link 4.7 years ago by lieven.sterck 15k

Thanks for your answer, it's clearer now.

However, I am not familiar with this output: how did you calculate 1900 bp?

I have looked at the sequences rnd-1_family-52#DNA/Maverick and rnd-1_family-50#Unknown generated by RepeatModeler, and their sizes are 6975 bp and 1116 bp respectively.
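These sizes can be recovered from the .out lines themselves. The sketch below assumes the usual RepeatMasker convention that, for complement (`C`) hits, the three "position in repeat" columns are listed as (left), end, begin instead of begin, end, (left). The helper `parse_out_line` and the dictionary keys are made up for illustration and are not part of RepeatMasker.

```python
def parse_out_line(line):
    """Parse one data line of a RepeatMasker .out file (whitespace separated)."""
    f = line.split()

    def unparen(s):
        # "(left)" values are written in parentheses, e.g. "(1868)".
        return int(s.strip("()"))

    strand = f[8]
    if strand == "C":
        # Assumed column order for complement hits: (left), end, begin.
        rep_left, rep_end = unparen(f[11]), int(f[12])
    else:
        # Assumed column order for forward hits: begin, end, (left).
        rep_end, rep_left = int(f[12]), unparen(f[13])

    return {
        "repeat": f[9],
        # Last aligned position plus the bases left after it gives the length.
        "query_length": int(f[6]) + unparen(f[7]),
        "consensus_length": rep_end + rep_left,
    }


lines = [
    "428 7.3 22.8 0.0 ctg1371 230 365 (1868) C rnd-1_family-52 DNA/Maverick (6794) 181 15 1",
    "381 14.8 19.9 1.7 ctg1371 232 382 (1851) C rnd-1_family-50 Unknown (938) 178 1 2 *",
]
hits = [parse_out_line(line) for line in lines]

for hit in hits:
    print(hit["repeat"], hit["query_length"], hit["consensus_length"])
# rnd-1_family-52 2233 6975
# rnd-1_family-50 2233 1116
```

Under that assumption the consensus lengths come out as 6975 and 1116, matching the RepeatModeler sequences, and the values near 1900 in the (left) query column are the bases remaining on the 2233 bp contig after each hit.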

ADD REPLY • link 4.7 years ago by Picasa ▴ 650

Yeah, my bad... I was looking at the wrong column. You are indeed correct with respect to their lengths.

ADD REPLY • link 4.7 years ago by lieven.sterck 15k