I have a read file and assembled contigs (both in .fasta format). I need to calculate the coverage for every single contig across the read file. I tried (1) indexing with bowtie and (2) aligning with both bowtie-align and tophat2. It is giving me the error "Splice sequence indexing failed with err =1". Kindly help me with how to proceed with the coverage calculation for each contig.
Hi Jyotika,
Reads in fasta? Are you sure it's not fastq?
This does not make sense. What do you mean by "across the read file"? Please elaborate.
You need to tell us the exact commands you ran.
Thanks
Vijay
The command that I ran was: tophat2 -r 20 454.10species.fasta MC55.MG10.AS1.C1.fasta
Mine is metagenome data. I need to get the coverage/depth of every single contig in my .fasta file. The file MC55.MG10.AS1.C1.fasta has only one sequence; likewise, I have 10000 contigs for which I need the coverage/depth. 454.10species.fasta is my metagenome file after sequencing.
I am not sure why you are using tophat for this analysis. It may be simpler to use bwa or bbmap (from the BBMap suite) and get alignments of your reads against the assembly. You would probably want to place multi-mapping reads in a single random location. Finally, follow that up by using mosdepth (to get single-base-level coverage) or samtools idxstats analysis to get counts per contig.
Note: Having a single fasta sequence per file is going to make this ridiculously clumsy. Consider concatenating original reads in a single multi-fasta file. If you have original fastq format reads available then I would rather use those.
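If it helps, here is a minimal Python sketch of the last step, assuming the reads have already been aligned to the contigs and samtools idxstats has been run on the sorted, indexed BAM. It parses the tab-separated idxstats output (contig name, contig length, mapped reads, unmapped reads) and estimates mean depth per contig as mapped reads × mean read length / contig length. The function names, the example numbers, and the 400 bp mean read length are made up for illustration. The result is only a rough estimate, since it assumes every mapped read aligns over its full length; mosdepth reports actual per-base depth.

```python
import csv
import io


def parse_idxstats(text):
    """Parse `samtools idxstats` output into {contig: (length, mapped_reads)}.

    Each line is tab-separated: name, length, mapped reads, unmapped reads.
    The final "*" line (reads not assigned to any contig) is skipped.
    """
    stats = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 4 or row[0] == "*":
            continue
        stats[row[0]] = (int(row[1]), int(row[2]))
    return stats


def estimate_depth(stats, mean_read_length):
    """Approximate mean depth per contig.

    Assumption: every mapped read contributes `mean_read_length` aligned
    bases, so depth ~= mapped_reads * mean_read_length / contig_length.
    """
    depth = {}
    for name, (length, mapped) in stats.items():
        depth[name] = mapped * mean_read_length / length if length else 0.0
    return depth


# Hypothetical idxstats output for two contigs (values are illustrative only).
EXAMPLE = "contig_1\t1000\t50\t0\ncontig_2\t2500\t10\t0\n*\t0\t0\t7\n"

if __name__ == "__main__":
    # 400 bp is an assumed mean read length, not a value from the thread.
    for contig, d in estimate_depth(parse_idxstats(EXAMPLE), 400).items():
        print(f"{contig}\t{d:.2f}")
```

In practice you would read the text from a file holding the idxstats output instead of the EXAMPLE string, and replace 400 with the mean read length measured from your own reads.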