Using RepeatModeler and RepeatMasker for Multiple Gastropod Genomes: Is a Single Repeat Library Sufficient?
23 days ago
Rohan ▴ 20

Hello everyone,

I'm currently working on the genomes of 30 taxa within Gastropoda, and I have the complete genome sequences for each species. My aim is to functionally annotate each genome. The pipeline I'm using involves:

  1. Running RepeatModeler to build a custom repeat library.
  2. Using RepeatMasker to mask repetitive elements across each genome.
  3. Proceeding with BRAKER3 for gene prediction after masking.
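For reference, the first two steps can be sketched per genome roughly as follows. This is a minimal dry-run sketch, not a tested pipeline: the helper name `repeat_commands`, the file names, and the output directory naming are hypothetical, and it only builds and prints the command lines rather than running them. It assumes a RepeatModeler version that accepts `-threads` (older releases use `-pa` instead) and uses RepeatMasker's `-xsmall` to soft-mask, which is what gene predictors such as BRAKER generally expect.

```python
from pathlib import Path


def repeat_commands(genome_fasta, threads=8):
    """Return the command lines (as argument lists) for one genome.

    Hypothetical helper: nothing is executed here, the commands are only built.
    """
    genome = Path(genome_fasta)
    db_name = genome.stem  # e.g. "species_a" for "species_a.fa"
    # RepeatModeler writes its consensus library as <database>-families.fa
    library = f"{db_name}-families.fa"
    return [
        # 1. Index the genome for RepeatModeler
        ["BuildDatabase", "-name", db_name, str(genome)],
        # 2. De novo repeat discovery (assumes a version with -threads)
        ["RepeatModeler", "-database", db_name, "-threads", str(threads)],
        # 3. Soft-mask the genome with the species-specific library
        ["RepeatMasker", "-lib", library, "-xsmall", "-pa", str(threads),
         "-dir", f"{db_name}_masked", str(genome)],
    ]


if __name__ == "__main__":
    # Placeholder file names; replace with the real genome FASTA files.
    for fasta in ["species_a.fa", "species_b.fa"]:
        for cmd in repeat_commands(fasta):
            print(" ".join(cmd))
```

Printing the commands first makes it easy to check them before submitting one job per species to a scheduler.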

I have a question about optimizing this process:

Do I need to generate a RepeatModeler library for each species individually? Since all taxa belong to Gastropoda, would creating a custom repeat library for each species give significant benefits over building a library from one or a subset of these genomes? My concern is about computational time and redundancy in repeats that might be highly similar across these taxa.

Any insights or suggestions on whether I should stick to one model or customize for each species would be greatly appreciated. Thank you!

genome-annotation RepeatModeller RepeatMasker
23 days ago
Mensur Dlakic ★ 28k

"Since all taxa belong to Gastropoda, would creating a custom repeat library for each species give significant benefits over building a library from one or a subset of these genomes?"

It depends on whether you are interested in doing this fast or doing it well. I would do them individually unless they were near-identical strains of the same species.

"My concern is about computational time and redundancy in repeats that might be highly similar across these taxa."

You have information that you didn't share with us: how similar are these groups? I would imagine that genomes with more than 80-90% sequence identity are likely to have near-identical repeats. Even so, some repeats may be unique to particular groups, so I would still build the libraries individually, as proposed above.
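To make the 80-90% figure concrete, here is a minimal sketch of one common way to compute percent identity from two sequences that have already been aligned. The function name `percent_identity` is hypothetical, and the definition used (matches divided by columns where neither sequence has a gap) is only one of several conventions; whole-genome comparisons would normally rely on a dedicated alignment tool rather than this.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    # Return 0.0 when no ungapped columns exist, to avoid dividing by zero
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy alignment: 4 matches over 5 ungapped columns
    print(percent_identity("ACGT-A", "ACCTTA"))
```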


Thank you for the insights! These taxa are indeed entirely different species within the same family, so it's very likely that their genomes share less than 80-90% identity. Given this, I’ll proceed with creating custom repeat libraries for each species.
