htseq-count -s reverse -f bam A.bam human.gtf > htseq.A.count.txt &
./featureCounts -p --countReadPairs -a human.gtf -o A.featurecount.txt -T 4 -s 2 A.bam &
I am analyzing RNA-Seq data using HTSeq-count and FeatureCounts, and I've found that the fragment quantification results from FeatureCounts are about half of those from HTSeq-count. This observation doesn't align with the paper , which suggests that HTSeq-count results are generally slightly lower than those from FeatureCounts. I am confused.
Maybe someone know the reason or have ways to trace the problem?
htseq-count result (show a few randomly)
ENSG00000284681 274 ENSG00000284634 135 ENSG00000284526 139 ENSG00000284431 8 ENSG00000198804 608623
featureCounts result
ENSG00000284681 172 ENSG00000284634 97 ENSG00000284526 82 ENSG00000284431 9 ENSG00000198804 307752
sorry I don't know how to show the table properly at here.
If you are not sure about the best method to count reads or if you want to compare methods, I would pick some gene IDs with small counts differences between methods on a number of countable reads (like ENSG00000284431, one can count 10 reads by eye) and check your bam file on IGV for this specific gene ID to get an idea of how the counting could have been done. Maybe you will see that 1 read is overlapping 2 genes that could be discarded in one method but not the other.
Thank you for your help. I'll definitely try the method you suggested. I was just a bit puzzled because I had initially thought that htseq-count defaults to counting reads rather than read pairs (fragments). However, I found in the Q&A section of htseq-count that it states that for paired-end data, it actually defaults to counting fragments.
https://htseq.readthedocs.io/en/master/htseqcount.html
I choose one of the gene where featureCounts counts as 2 but htseq-count counts as 0. But... I don't know how to read the number of fragments from the screen .... (oh no.....
These tools will account for fragments at the moment you define paired-end reads as input, as the 2 reads of the same fragments are sequencing the same molecule.
Which gene are we talking about ? You have to check where the fragments start and end to understand why one method count them in or out