Workflow for annotating repeat elements
7.3 years ago
amy.bashir ▴ 110

Hello everyone!

I am annotating repeat elements in a new genome. From what I have read online and in papers, the workflow is 1) Dustmasker, 2) TRF, 3) RepeatModeler and 4) RepeatMasker.

I just finished masking the low-complexity regions with Dustmasker. Should I use the hard-masked file as input for TRF, or the original genome sequence file? Is it ever a good idea to use a masked sequence as input for another repeat-masking program?

Thank you very much!

Repeat elements • 3.1k views

RepeatModeler already uses Tandem Repeats Finder (TRF). Why are you running it before RepeatModeler? I wouldn't mask my data beforehand, though: in masked data the repetitive sequences are hidden away.


I saw that RepeatModeler uses TRF, but I got 0% for "simple repeats" and "low complexity", so I wondered whether running TRF separately might show something different.

I am doing the repeat element annotation to see what percentage of the genome is repeat sequences, not to mask the sequence.

Also, most of the recent whole-genome analysis papers that I have come across use both RepeatModeler+RepeatMasker and TRF, so I was wondering whether they do different things.
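
One way to check the tandem-repeat fraction independently is to run TRF on the unmasked genome and add up the bases its hits cover. The following is a minimal sketch, not a definitive method. It assumes TRF was run with the parameters recommended in its documentation plus the -d flag (for example: trf genome.fa 2 7 7 80 10 50 500 -d -h), which writes a .dat file whose data lines begin with the start and end coordinates of each hit. The function name trf_covered_bases and the file names are hypothetical.

```shell
# Sum the bases covered by TRF hits in a .dat file, merging overlapping
# hits so that no base is counted twice.
# Assumption: each sequence is introduced by a "Sequence: <name>" line and
# every data line starts with two integers (start and end, 1-based inclusive).
trf_covered_bases() {
    # Step 1: emit "sequence_index start end" for every data line.
    awk '
        /^Sequence:/ { seq++; next }
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { print seq + 0, $1, $2 }
    ' "$1" |
    # Step 2: sort hits by sequence, then by start and end coordinate.
    sort -k1,1n -k2,2n -k3,3n |
    # Step 3: merge overlapping intervals and add up their lengths.
    awk '
        function flush() { if (have) total += cur_end - cur_start + 1 }
        {
            if (have && $1 == cur_seq && $2 + 0 <= cur_end) {
                if ($3 + 0 > cur_end) cur_end = $3 + 0
            } else {
                flush()
                cur_seq = $1; cur_start = $2 + 0; cur_end = $3 + 0; have = 1
            }
        }
        END { flush(); print total + 0 }
    '
}
```

Called as trf_covered_bases genome.fa.2.7.7.80.10.50.500.dat, it prints the number of bases covered; dividing that by the total genome length gives the tandem-repeat percentage.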


I used the following workflow to annotate repetitive elements in a new genome in my own work:

./BuildDatabase -name your_desired_name input_genome.fa

./RepeatModeler -engine ncbi -pa 15 -database your_desired_name

./RepeatMasker -pa 16 -gff -xsmall -lib /path/to/consensi.fa.classified -dir /path/to/output input_genome.fa
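
Because -xsmall soft-masks repeats as lowercase letters, the repeat percentage can be cross-checked by counting lowercase bases in the masked FASTA. This is a rough sketch rather than part of the workflow above. The function name softmasked_percent is hypothetical, and it assumes the input genome had no lowercase bases to begin with; otherwise those are counted too.

```shell
# Print the percentage of bases that are lowercase (soft-masked) in a FASTA file.
softmasked_percent() {
    # LC_ALL=C keeps [a-z] restricted to ASCII lowercase letters.
    LC_ALL=C awk '
        /^>/ { next }                    # skip header lines
        {
            total += length($0)          # count every base on the line
            masked += gsub(/[a-z]/, "")  # gsub returns the number of matches replaced
        }
        END {
            if (total > 0) printf "%.2f\n", 100 * masked / total
            else print "0.00"
        }
    ' "$1"
}
```

For the command above this would be run as softmasked_percent /path/to/output/input_genome.fa.masked. RepeatMasker also writes a .tbl summary with per-class percentages, which should roughly agree with this number.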