Dear all,
I was trying to map RNA-Seq data to an assembled genome using TopHat (tophat-2.0.10.Linux_x86_64) and Bowtie2 (bowtie2-2.1.0). When I give it the index of the repeat-masked genome (masking was done with RepeatModeler; genome size 78 Mb; repeats: 42%), TopHat2 gives me the following error message:
bowtie2-inspect SCa_gtr_500_discarded_90_percent_Ns_ID_renamed.fasta.masked.index
assert_eq: expected (1816, 0x718) got (1536, 0x600)
bt2_inspect.cpp:218
bowtie2-inspect: bt2_inspect.cpp:218: void print_ref_sequences(std::ostream&, bool, const EList<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, 128>&, const uint32_t*, const std::string&): Assertion `0' failed.
Aborted
But when I did the same thing on the unmasked genome, it ran fine. My questions are:
a) Is this error usual with all repeat-masked genomes?
b) I need to predict genes using a reference-based RNA-Seq assembly, so should I really map the reads to the repeat-masked genome?
c) I am interested in finding genes on the repeat-masked genome, so how can I fix this problem?
I would really appreciate your comments on this!
Best regards, Rahul
Is the result of repeat masking that there are some contigs with only Ns? If you
Yes, I filtered out the scaffolds that contained only Ns, and now it runs fine. But Bowtie2 should also report this with a proper error message.
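For anyone who hits the same assertion, the filtering step described above can be sketched roughly as follows. This is only a minimal sketch, not the script actually used in this thread: the function names (read_fasta, is_all_n, filter_all_n) and the file names in the usage comment are made up for illustration, and it assumes the masked bases are written as N (or n) in a plain FASTA file.

```python
import sys


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def is_all_n(sequence):
    """True if the sequence is empty or consists only of N/n characters."""
    return sequence.strip("Nn") == ""


def filter_all_n(in_handle, out_handle, width=60):
    """Copy records to out_handle, skipping all-N ones; return dropped headers."""
    dropped = []
    for header, seq in read_fasta(in_handle):
        if is_all_n(seq):
            dropped.append(header)
            continue
        out_handle.write(">" + header + "\n")
        # Re-wrap the sequence at a fixed line width.
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")
    return dropped


# Hypothetical usage: python drop_all_n.py masked.fasta filtered.fasta
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fin, open(sys.argv[2], "w") as fout:
        for name in filter_all_n(fin, fout):
            sys.stderr.write("dropped all-N sequence: " + name + "\n")
```

After filtering, the index has to be rebuilt from the filtered FASTA with bowtie2-build before running TopHat2 again.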