Repeat Modeler Database Generation
4 weeks ago
SomeOne ▴ 170

Hi, I am posting this again because my last post didn't get any comments.

I have some fungal genomes sequenced with Illumina short reads. A handful of them were also sequenced with Nanopore. All samples belong to one species complex.

I have generated:

  1. Nanopore assemblies built with Flye, then polished and filtered to remove small contigs (contigs < 5000 bp were removed).
  2. Illumina assemblies built with SPAdes v4.0, followed by QC and a check of all assembly stats.
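For reference, the contig length filter in step 1 can be sketched roughly as below. This is only a minimal illustration in plain Python, not the exact tool I ran; the function names `read_fasta` and `filter_contigs` are made up for this example, and the 5000 bp cutoff is the one mentioned above.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)

def filter_contigs(handle, min_len=5000):
    """Keep contigs of at least min_len bp (drops contigs < min_len)."""
    return [(h, s) for h, s in read_fasta(handle) if len(s) >= min_len]

# Toy input: one contig just under the cutoff, one exactly at it.
toy = io.StringIO(">short\n" + "A" * 4999 + "\n>ok\n" + "C" * 3000 + "\n" + "G" * 2000 + "\n")
kept = filter_contigs(toy)
print([(h, len(s)) for h, s in kept])
```

On real assemblies an established tool would do the same job; the sketch is only meant to make the cutoff explicit (contigs of exactly 5000 bp are kept).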

Now I want to perform repeat annotation before going on to the genome annotation.

As I remember, RepeatModeler is used to generate a de novo repeat database from the query genome, and RepeatMasker is then used to actually mask the FASTA file. Correct me if I am wrong.
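To make sure I have the workflow right, here is a minimal sketch of the three commands as I understand them. It only builds and prints the command lines, it does not run anything. The helper name `repeat_commands`, the database name `fungi_db`, and the file name are placeholders I made up. The flags are the ones I believe the tools accept (`-threads` in recent RepeatModeler 2 releases, `-pa` in older ones), so please check them against your installed versions.

```python
import shlex

def repeat_commands(genome_fasta, db_name, threads=8):
    """Return the command lines for a de novo repeat library plus masking run.

    Assumes RepeatModeler writes its consensus library to
    '<db_name>-families.fa' in the working directory.
    """
    # 1. Build the sequence database RepeatModeler works on.
    build = ["BuildDatabase", "-name", db_name, genome_fasta]
    # 2. Discover repeat families de novo (older versions use -pa, not -threads).
    model = ["RepeatModeler", "-database", db_name, "-threads", str(threads)]
    # 3. Soft-mask the genome with the custom library and write a GFF.
    mask = [
        "RepeatMasker",
        "-lib", f"{db_name}-families.fa",
        "-pa", str(threads),
        "-xsmall",
        "-gff",
        genome_fasta,
    ]
    return [build, model, mask]

for cmd in repeat_commands("sample1.fasta", "fungi_db"):
    print(shlex.join(cmd))
```

With this layout the library is built once, and only the third command has to be repeated for each assembly that should be masked.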

My questions are:

  • Should I merge all my Nanopore-based assemblies into ONE-BIG.fasta and use that to generate the de novo repeat database with RepeatModeler, then mask each of the nanopore.fasta assemblies individually? And do the same for the SPAdes assemblies (merge all FASTA files -> build the database -> mask each individual FASTA)?

  • The second way that comes to my mind is to merge all the Nanopore.fasta and Spades.fasta genomes into ONE-REALLY-BIG.fasta, use RepeatModeler to generate the de novo repeat database from it, and then mask repeats in all FASTA genomes individually using this database.
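Either way, the merge step I have in mind would look roughly like the sketch below. It prefixes each contig name with its sample name, because Flye and SPAdes tend to reuse the same contig names (e.g. `contig_1`, `NODE_1_...`) in every assembly and the merged file would otherwise contain duplicate IDs. The function `merge_fastas`, the `sample|contig` naming scheme, and the sample names are my own assumptions for illustration.

```python
import io

def merge_fastas(inputs, out):
    """Write all records to `out`, renaming headers to '>sample|contig'.

    `inputs` maps a sample name to an iterable of FASTA lines
    (for example an open file handle). Returns the number of records.
    """
    n_records = 0
    for sample, lines in inputs.items():
        for line in lines:
            if line.startswith(">"):
                # Keep only the first token of the header as the contig ID.
                fields = line[1:].split()
                contig_id = fields[0] if fields else "unnamed"
                out.write(f">{sample}|{contig_id}\n")
                n_records += 1
            elif line.strip():
                out.write(line.rstrip("\n") + "\n")
    return n_records

# Toy example: two samples that both contain a 'contig_1'.
merged = io.StringIO()
count = merge_fastas(
    {
        "ont_A": [">contig_1 length=8\n", "ACGTACGT\n"],
        "spades_B": [">contig_1\n", "TTTTGGGG\n"],
    },
    merged,
)
print(count)
print(merged.getvalue(), end="")
```

In practice the values in `inputs` would be open file handles for the individual assemblies, and `out` the merged file that goes into the database build.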

Will merging these different assemblies introduce any bias or other issues for my genome assemblies, technically or biologically? Again, all samples belong to one species complex.

Kindly share your views on this. Thanks.

repeat_modeler repeats fungi annotation genomes • 200 views