Which Tools do you use for (Tandem) Repeat Detection? Why?
9.5 years ago
voidnyx ▴ 10

Hi,

I recently started my work for a university project where I want to analyze repetitive sequences in genomic DNA of two closely related species. I have no prior experience with this and have read some papers to start with.

I found there are a handful of programs for Tandem Repeat detection like TRF, Mreps, ATRhunter, RepeatMasker and so on. Based on different approaches and different opinions I could not really see a consensus in the articles I read on which program to use, not even on distinct (dis)advantages.

What I took away from them is that you have to be extra careful with the parameter settings for most of the programs, and that you basically have to be the author of a program to actually understand what's happening.

So I would like to hear opinions from people who actually use some kind of repeat detection software or have experience with the matter. Which programs do you think are good, and where do you see their advantages or disadvantages?

PS: I am not sure this is the right place for this kind of "open" question so feel free to move it

Tandem-Repeat genome Repeat-Detection • 4.4k views

Use TRF (this is based on my subjective experience and on the objective number of citations in PubMed):

  • TRF - 929
  • Mreps - 70
  • ATRhunter - 12
  • RepeatMasker uses TRF

The definition of a tandem repeat is itself dark magic: how many times must a motif repeat? How many mismatches or gaps do we allow? I guess it is hard to really judge which tool is clearly more accurate than the others. Just stick with what everyone uses: TRF. UCSC shows its settings, which you can copy. If you have run RepeatMasker, you can take its output.
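
To illustrate how much the definition matters, here is a minimal Python sketch. The function name `count_tandem_copies` and its parameters are made up for this example and do not correspond to any option of TRF or another tool; it only counts substitutions and ignores gaps. The same sequence yields a different copy number depending on how many mismatches per copy are tolerated.

```python
def count_tandem_copies(seq, motif, max_mismatches_per_copy=0):
    """Count consecutive copies of `motif` starting at position 0 of `seq`.

    A window counts as a copy if it differs from the motif at no more than
    `max_mismatches_per_copy` positions (substitutions only, no gaps).
    This is an illustrative toy, not how any published tool scores repeats.
    """
    k = len(motif)
    copies = 0
    for start in range(0, len(seq) - k + 1, k):
        window = seq[start:start + k]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches > max_mismatches_per_copy:
            break  # the run of acceptable copies ends here
        copies += 1
    return copies


seq = "ACACATACAC"  # AC AC AT AC AC: one diverged copy in the middle
strict = count_tandem_copies(seq, "AC", max_mismatches_per_copy=0)
relaxed = count_tandem_copies(seq, "AC", max_mismatches_per_copy=1)
print(strict, relaxed)
```

With a hypothetical cutoff of at least three copies, the strict setting would not report this locus at all, while the relaxed setting would report it as a five-copy repeat.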

9.3 years ago
Elke Schaper ▴ 110

I've worked quite a bit with tandem repeats for my Ph.D. In the very beginning, I learned the same thing you describe: the tool, combined with the chosen parameters, has a large influence on the result.

Check out my and others' publications on the topic:

Repeat or not repeat?- Statistical validation of tandem repeat prediction in genomic sequences

Detecting short tandem repeats from genome data: opening the software black box

What I've learned from all this is:

  • There are useful and less useful tandem repeat detection tools. TRF might be well cited, but it suffers from very low sensitivity and black-box code.
  • It is often useful to collect data from several tandem repeat detectors, to make sure your sensitivity is sufficient.
  • For protein tandem repeat detection tools, it is necessary to perform a statistical test on each proposed tandem repeat to control for false positive annotations. I'm not sure whether this is true for genomic tandem repeats also, but I wouldn't be surprised, again given the black box character of many of the tools.
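
Collecting calls from several detectors can be sketched roughly as follows. This assumes each tool's output has already been parsed into half-open `(start, end)` intervals on the same sequence; the tool names and the function `merge_calls` are placeholders, not part of any real package.

```python
def merge_calls(calls_by_tool):
    """Merge overlapping (start, end) intervals reported by several tools.

    `calls_by_tool` maps a tool name to a list of half-open (start, end)
    intervals on the same sequence. Returns one (start, end, tools) tuple
    per merged region, where `tools` is the sorted list of supporting tools.
    """
    tagged = sorted(
        (start, end, tool)
        for tool, intervals in calls_by_tool.items()
        for start, end in intervals
    )
    merged = []
    for start, end, tool in tagged:
        if merged and start < merged[-1][1]:
            # Overlaps the previous region: extend it and record the tool.
            prev_start, prev_end, tools = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), tools | {tool})
        else:
            merged.append((start, end, {tool}))
    return [(s, e, sorted(t)) for s, e, t in merged]


# Hypothetical coordinates, for illustration only.
calls = {
    "tool_a": [(100, 150), (400, 420)],
    "tool_b": [(120, 160)],
    "tool_c": [(900, 930)],
}
regions = merge_calls(calls)
for region in regions:
    print(region)
```

Keeping the list of supporting tools per region makes it easy to see afterwards which loci only a single detector found.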

However, if you're lucky, you're only interested in non-diverged or perfect tandem repeats. In this case, the detection task is simple, and one tool (but in my experience not TRF) will do, and no additional testing for false positives is necessary.
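
For the perfect-repeat case, a regular expression with a backreference is already enough for a rough scan. The following is a minimal sketch, not a replacement for a dedicated tool; the function name and the default thresholds (unit length 1 to 6, at least 3 copies) are arbitrary assumptions, and `min_copies` is assumed to be at least 2.

```python
import re


def find_perfect_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, end, unit, copies) for perfect tandem repeats in `seq`.

    The lazy quantifier makes the regex try the shortest unit first, so
    "AAAAAA" is reported as six copies of "A" rather than three of "AA".
    Only uppercase A/C/G/T are considered; coordinates are half-open.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        start, end = match.span()
        hits.append((start, end, unit, (end - start) // len(unit)))
    return hits


hits = find_perfect_tandem_repeats("GGCACACACATT")
print(hits)
```

On a whole genome this would be slow and memory-hungry if run on one long string, so in practice one would scan sequence by sequence or in chunks.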

Good luck!


Could you name the tool you would recommend then?

9.5 years ago
h.mon 35k

I am experimenting with MISA now (210 PubMed citations), because it already has some scripts to design primers from its output. So far, I am happy, but I haven't tried any PCR yet.


Hi h.mon. Can you tell me how you convert the MISA-generated files into a Primer3 input file? I used the primer3_in.pl Perl program, but it shows this warning: "use of uninitialized value $count in concatenation or string in primer3_in.pl line 34 <SRC> Chunk 22205". Could you please help me get rid of this problem?


Which version of Primer3? MISA is quite old and needs the Primer3 version 1 series; it won't work with the version 2 series.

Yes, that's right. But P3_in.pl does not give any output in my case. The output of P3_in.pl will be the input for Primer3. I have both versions of Primer3 anyway.