How to find a tandem duplication pattern in a DNA sequence
3.6 years ago
kumajis • 0

Hi,

I have a DNA sequence that contains a tandem duplication. Both the repeated unit and the number of copies are unknown.

Each copy carries some mutations. If the repeat unit is A (1-8 kb long), the sequence is A1A2A3...An, where A1 and An may be incomplete.

How can I identify the repeat unit A?

Thanks.

Repeat • 3.5k views
ADD COMMENT

If you are interested in finding the exact number or sequence of the repeats, then a long-read sequencing technology would be ideal.

If you already have sequence data then please specify the kind.

ADD REPLY

Yes, I have some rolling amplified nanopore sequencing data. I want to separate the repeated units and then generate a single consensus read for better accuracy.

ADD REPLY

some rolling amplified nanopore sequencing data

You should have included this information in the original post.

Even if ReDTandem is not supported you may be able to use it. If it is not meant to be used with long reads then that would be an exception.

Are you using a custom protocol to do this sequencing? What do you mean by "rolling amplified" data? AFAIK there is no rolling-circle-like sequencing in Nanopore as there is in PacBio.

ADD REPLY

Thanks,

ReDTandem can no longer be downloaded because the author closed his website.

"Rolling amplified" means the target DNA is amplified several times into a tandem sequence. If I can separate these tandem copies after nanopore sequencing, I can add a consensus step to increase the accuracy of the target DNA sequence.
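To make the "separate the copies, then take a consensus" idea concrete, here is a minimal Python sketch. It assumes the unit length (`period`) is already known and that the copies differ only by substitutions. That second assumption is unrealistic for nanopore reads, which also contain many insertions and deletions, so in practice the copies would need to be aligned to each other before voting. The function names `split_into_units` and `majority_consensus` are made up for illustration and do not come from any existing tool.

```python
from collections import Counter


def split_into_units(read, period, offset=0):
    """Cut a read into consecutive chunks of `period` bases.

    Partial copies at either end are dropped. `offset` is where the
    first full chunk starts (hypothetical parameter for illustration).
    """
    units = []
    for start in range(offset, len(read) - period + 1, period):
        units.append(read[start:start + period])
    return units


def majority_consensus(units):
    """Column-wise majority vote over equal-length units.

    Assumes substitutions only: every unit has the same length and
    position i in one unit corresponds to position i in the others.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*units))


if __name__ == "__main__":
    # Toy read: three copies of ACGT (the second has one substitution)
    # followed by an incomplete fourth copy.
    toy_read = "ACGTACGAACGTAC"
    units = split_into_units(toy_read, period=4)
    print(units)
    print(majority_consensus(units))
```

Note that the consensus comes out in whatever phase the read happens to start in, so two reads of the same molecule can yield rotated versions of the same unit.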

ADD REPLY

Hi,

Actually, what I need to process is a large set of sequencing reads from rolling amplification. Are there any bioinformatics tools for this purpose?

I found one named ReDTandem, but the author no longer supports it.

Thanks

ADD REPLY
3.6 years ago

Tandem Repeats Finder may be of use:

  1. Click on "Submit a Sequence for Analysis" and select "Basic" (if you want to adjust search parameters, pick a different option).
  2. Click on the "Cut and paste sequence" radio button and paste in your sequence of interest.
  3. Click "Submit sequence" and wait a few moments (~10-15s).
  4. Click on the "Tandem Repeat Report" link to open a summary table of those repeats discovered within your window of interest.
  5. Click on the items in the "Indices" column to get more details on sequences, periodicity, content, etc.
ADD COMMENT

Hi,

Tandem Repeats Finder is good at finding STRs (short tandem repeats), but can it pick out much longer repeat patterns, such as tandemly duplicated genes?

Thanks

ADD REPLY

For repeats longer than 2 kb, you'll probably want to investigate other answers/tools.

ADD REPLY
3.6 years ago
cmdcolin ★ 4.0k

I don't have much experience in this area, but you could look at tools ranging from CNV finders (e.g. finding a DUP overlapping a gene) to something narrower like https://github.com/delehef/asgart, or see the post "How to detect segmental duplications?"

Finding DUP CNVs (e.g. increased read coverage overlapping a gene) is likely a great first step, as it is one of the more obvious signals you can pick up, but more hidden patterns could be revealed by a dedicated segmental duplication finder.

ADD COMMENT

Thanks, but DUP or CNV analysis tools seem too complex for my goal.

ADD REPLY
3.6 years ago

If it's just contig analysis (by that I mean, you're looking at single sequences, and don't want to massively scale), the simplest approach is probably doing a dotplot or even blast2sequences on NCBI Blast with graphical output.

Compare the alignment patterns to find hints of repetition or duplication. You can adjust the parameters to get, e.g., 80% or 95% identity alignments.
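In a self-dotplot, a tandem array shows up as parallel diagonals spaced one unit length apart. The same signal can be read off without plotting: record how far apart each k-mer recurs and take the most common distance as a guess for the unit length. The Python sketch below does that. It is only an illustration, not a tested tool; `estimate_repeat_period` and `make_toy_tandem` are hypothetical names, and the defaults (`k=12`, `min_period=50`) are arbitrary choices rather than recommended settings. The toy data contains substitutions only; with the indels typical of nanopore reads the distances would scatter around the true unit length rather than hit it exactly.

```python
import random
from collections import Counter


def estimate_repeat_period(seq, k=12, min_period=50):
    """Guess the tandem unit length from distances between repeated k-mers.

    For each k-mer, the distance to its previous occurrence is recorded;
    the most frequent distance (>= min_period) is returned, or None if
    no k-mer recurs at such a distance.
    """
    last_seen = {}
    offsets = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            d = i - last_seen[kmer]
            if d >= min_period:
                offsets[d] += 1
        last_seen[kmer] = i
    if not offsets:
        return None
    return offsets.most_common(1)[0][0]


def make_toy_tandem(unit_len=300, copies=5, sub_rate=0.03):
    """Build a toy tandem array: a random unit copied with random substitutions."""
    unit = [random.choice("ACGT") for _ in range(unit_len)]
    out = []
    for _ in range(copies):
        out.extend(
            random.choice("ACGT") if random.random() < sub_rate else base
            for base in unit
        )
    return "".join(out)


if __name__ == "__main__":
    random.seed(0)
    # Trim both ends so the first and last copies are incomplete,
    # as in the A1A2...An description in the question.
    read = make_toy_tandem()[100:-120]
    print(estimate_repeat_period(read))
```

Once a candidate unit length is known, it can be cross-checked against the diagonal spacing in a dotplot or BLAST graphical output.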

ADD COMMENT

This is probably a better answer than mine if it is just a sequence vs. sequence comparison!

ADD REPLY
6 months ago
micah ▴ 30

I built a web application that directly finds the repeat unit and the number of repeats; try it at http://64.64.240.35:8050/.

[Figure: dot plot showing 5 tandem duplications followed by an inverted duplication]

ADD COMMENT
