About RepeatMasker
13.1 years ago
Zhshqzyc ▴ 520

Hello,

I have a nucleotide sequence (FASTA format) with a size limit of 20 kb. I also have my own genome sequence file (repeat database, also FASTA format, 200 GB) on my local machine. I want to identify repetitive elements in the genome sequence.

Questions: 1) Which software is best? I have heard about RepeatMasker. 2) If I use RepeatMasker, what format does the repeat library need to be in? That is, do I have to convert the FASTA file to some other format? 3) What are low-complexity DNA sequences and interspersed repeats? (Off topic, of course; you don't have to answer this one.)

Thanks.

sequence fasta software • 6.3k views

Please read the RepeatMasker help pages and see if they answer your question: http://www.repeatmasker.org/webrepeatmaskerhelp.html


Partly. I still don't know the format of the reference repeat databases. Is a huge FASTA file okay?

13.1 years ago
Elizabeth ▴ 40

Hi,

First of all, for what purpose do you want to identify repetitive elements in your genome sequence?

If you are just interested in masking them from the genome, I would use RepeatMasker with a repeat database from RepBase (www.girinst.org; yes, FASTA format is OK) to mask transposable elements by similarity to already described transposons. To identify tandem repeats (typically minisatellites, repeated motifs of 20-50 nucleotides) you can use TRF (http://tandem.bu.edu/trf/trf.html). TANTAN also identifies tandem repeats, as well as low-complexity sequences (ATATAT, for example) (http://www.cbrc.jp/tantan/).
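As a rough illustration of the masking step, here is a minimal Python sketch that assembles a RepeatMasker command line using a custom FASTA repeat library. It assumes RepeatMasker is installed and on your PATH; the file names (`query.fa`, `repeats.fa`), the output directory, and the helper function names are hypothetical. The `-lib`, `-pa`, and `-dir` options are standard RepeatMasker options, but check `RepeatMasker -help` for your installed version.

```python
import shutil
import subprocess
from pathlib import Path


def build_repeatmasker_cmd(query_fasta, library_fasta, out_dir="rm_out", threads=1):
    """Return the argument list for a RepeatMasker run with a custom library."""
    return [
        "RepeatMasker",
        "-lib", str(library_fasta),  # custom repeat library in FASTA format
        "-pa", str(threads),         # number of parallel jobs
        "-dir", str(out_dir),        # directory for the output files
        str(query_fasta),            # sequence to be masked
    ]


def run_repeatmasker(query_fasta, library_fasta, out_dir="rm_out", threads=1):
    """Run RepeatMasker if it is installed; raise a clear error otherwise."""
    if shutil.which("RepeatMasker") is None:
        raise FileNotFoundError("RepeatMasker executable not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_repeatmasker_cmd(query_fasta, library_fasta, out_dir, threads)
    subprocess.run(cmd, check=True)


# Hypothetical usage (file names are placeholders):
# run_repeatmasker("query.fa", "repeats.fa", out_dir="rm_out", threads=4)
```

Note that the file given to `-lib` is expected to contain repeat consensus sequences (as in a RepBase download), not a whole genome, so a 200 GB genome FASTA would probably not be a suitable library as-is.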

If what you are interested in is identifying and classifying transposable elements in your genome, there are various tools for identifying different types of transposons, but that calls for a longer mail... let me know if that's the case.

Cheers,

Elizabeth
