Check whether the chromosome notation is the same in the BAM and the GTF. A common error is that the BAM uses names like 1, 2, 3 while the GTF uses chr1, chr2, chr3. As an aside, tophat has been deprecated for years. You might get better results with a more recent alignment tool such as hisat2, or with tools like kallisto and salmon that quantify your reads directly against a transcriptome. Based on recent literature, these lightweight quantifiers (salmon/kallisto) seem to be superior for RNA-seq quantification compared to traditional aligners.
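To make the notation check concrete, here is a minimal Python sketch. It assumes you have saved the BAM header as text (for example with samtools view -H) and can read the GTF as text; the function names and the toy data are made up for illustration and are not part of any tool.

```python
def sam_header_names(header_text):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_seqnames(gtf_text):
    """Collect the first column (seqname) of every non-comment GTF line."""
    names = set()
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(header_text, gtf_text):
    """Report which sequence names are shared and which occur on one side only."""
    bam = sam_header_names(header_text)
    gtf = gtf_seqnames(gtf_text)
    return {
        "shared": sorted(bam & gtf),
        "bam_only": sorted(bam - gtf),
        "gtf_only": sorted(gtf - bam),
    }


if __name__ == "__main__":
    # Toy inputs (hypothetical): the BAM uses 1, 2 and the GTF uses chr1.
    header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:2000\n"
    gtf = 'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1";\n'
    print(compare_names(header, gtf))
```

If "shared" comes back empty, no read can ever be assigned to a feature, which would explain counts that are all zero.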
It seems that you aligned your reads to a reference other than the genome: the RNAME is KB317696.1, not a chromosome. Also, you want to count reads on each exon, but the coordinates in your GTF are gene coordinates.
The problem is that HTSeq-count does not work this way. You need to align the reads to the genome and then give HTSeq the coordinates of each gene/transcript/exon from your GTF file. HTSeq needs both gene and exon information to deal with splicing events.
FYI: if you want to count reads on exons, just filter your SAM lines and group the reads by (1) same exon and (2) different exons, then count these lines.
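One possible reading of the SAM-filtering idea, as a rough Python sketch: take each mapped read, work out which reference blocks it covers from POS and CIGAR, and sort it into "hits one exon", "hits several exons", or "hits none". The exon list format (chrom, start, end, exon_id) and the function names are assumptions for illustration. Strand, mate pairs and multi-mapping reads are ignored here, so treat this as a sanity check rather than a replacement for HTSeq.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 1-based inclusive (start, end) reference blocks of an alignment."""
    blocks = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":            # aligned bases: consume the reference
            blocks.append((ref, ref + length - 1))
            ref += length
        elif op in "DN":           # deletion or intron: skip on the reference
            ref += length
        # I, S, H, P do not consume the reference
    return blocks


def count_reads_per_exon(sam_lines, exons):
    """exons: list of (chrom, start, end, exon_id), 1-based inclusive.

    Returns (reads per exon for single-exon reads,
             number of reads touching several exons,
             number of mapped reads touching no exon).
    """
    single = Counter()
    multi = 0
    unassigned = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue               # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or cigar == "*":
            continue               # unmapped read
        blocks = aligned_blocks(pos, cigar)
        hits = {
            eid
            for chrom, start, end, eid in exons
            if chrom == rname
            for bs, be in blocks
            if bs <= end and be >= start
        }
        if len(hits) == 1:
            single[next(iter(hits))] += 1
        elif hits:
            multi += 1
        else:
            unassigned += 1
    return single, multi, unassigned
```

Called as count_reads_per_exon(open("aln.sam"), exons), where aln.sam is a placeholder file name, it gives a quick picture of how many reads fall inside a single exon and how many are spliced across exons.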
Now I have used STAR to map the reads against the reference genome. The outputs are Aligned.sortedByCoord.out.bam and Aligned.out.bam.
More information is needed to debug this. If your BAM file is OK, then the problem is probably with --idattr locus_tag and your GFF file.
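A quick way to test the --idattr suspicion is to count which feature types the annotation contains and how many features of the type you are counting actually carry the attribute. The sketch below is plain Python with made-up function names; its attribute parser is simplified and will break on semicolons inside quoted values.

```python
from collections import Counter


def parse_attributes(column9):
    """Parse GFF3 (key=value) or GTF (key "value") attributes into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        eq = item.find("=")
        sp = item.find(" ")
        if eq != -1 and (sp == -1 or eq < sp):   # GFF3 style: key=value
            key, value = item[:eq], item[eq + 1:]
        elif sp != -1:                           # GTF style: key "value"
            key, value = item[:sp], item[sp + 1:]
        else:
            continue
        attrs[key.strip()] = value.strip().strip('"')
    return attrs


def attribute_coverage(annotation_text, feature_type="exon", id_attr="gene_id"):
    """Return (count per feature type, number of feature_type rows with id_attr)."""
    type_counts = Counter()
    with_attr = 0
    for line in annotation_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue                              # not a feature line
        type_counts[cols[2]] += 1
        if cols[2] == feature_type and id_attr in parse_attributes(cols[8]):
            with_attr += 1
    return type_counts, with_attr
```

The defaults mirror htseq-count, which counts exon features and uses gene_id unless --type and --idattr say otherwise. If attribute_coverage(text, "exon", "locus_tag") returns 0 for the second value while "gene" or "CDS" rows do carry locus_tag, the feature type and the ID attribute do not match, and changing --type to a feature that has the attribute is the likely fix.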
boaty, kindly look at the above stanza.