Comparison of HTSeq and featureCounts
10.7 years ago
HNK ▴ 150

Hey, I have results for approximately 50 samples from HTSeq and featureCounts. I want to compare the results of the two tools to see how close they are. That is, I need to compute the correlation between them. Or is there a way to get an output (a graph, etc.) that shows their relationship?

htseq • 21k views

To see how close or how different the results of the two tools are.

10.7 years ago

The featureCounts paper actually goes into some detail about when and why it will disagree with htseq-count (see section 5.2), so I'm not sure what further you're trying to achieve. If you really want, you might just group the counts into 3 categories: identical, htseq-count greater, featureCounts greater. You'll find the "identical" category contains most of the genes. For the differences, I think featureCounts generally performs in the correct way (or at least I've generally found Wei Shi's argument for why featureCounts works differently to be good).

Edit: The alternative method is exactly what they did in the featureCounts paper, which is to just make a Venn Diagram of the read assignment.
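The three-category grouping suggested in this answer, plus the correlation the question asks about, could be sketched roughly as follows. This is only an illustration: it assumes the per-gene counts for one sample have already been loaded into two plain dictionaries keyed by gene ID, and the function names `categorize` and `pearson_log` as well as the toy gene IDs are made up for the example.

```python
import math
from collections import Counter


def categorize(htseq, fcounts):
    """Count genes per category: identical, htseq_greater, featurecounts_greater.

    Both arguments are dicts mapping gene ID -> read count (assumed layout).
    Only genes present in both dicts are compared.
    """
    cats = Counter()
    for gene in htseq.keys() & fcounts.keys():
        h, f = htseq[gene], fcounts[gene]
        if h == f:
            cats["identical"] += 1
        elif h > f:
            cats["htseq_greater"] += 1
        else:
            cats["featurecounts_greater"] += 1
    return cats


def pearson_log(htseq, fcounts):
    """Pearson correlation of log2(count + 1) over the shared genes."""
    genes = sorted(htseq.keys() & fcounts.keys())
    x = [math.log2(htseq[g] + 1) for g in genes]
    y = [math.log2(fcounts[g] + 1) for g in genes]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data, invented purely for illustration.
htseq_counts = {"geneA": 10, "geneB": 5, "geneC": 0, "geneD": 7}
fc_counts = {"geneA": 10, "geneB": 6, "geneC": 0, "geneD": 6}

categories = categorize(htseq_counts, fc_counts)
r = pearson_log(htseq_counts, fc_counts)
print(dict(categories), round(r, 3))
```

With 50 samples, the same two functions could be run once per sample and the category counts tabulated side by side. Note that `pearson_log` as written will fail with a division by zero if all counts in one of the inputs are identical.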

10.7 years ago
dbpzdbpz ▴ 210

I think it is interesting to compare the two programs. featureCounts runs much faster and supports more input formats (including BAM files sorted by either read name or genomic coordinate). Still, it is also important to compare the read assignment results.

For single-end reads, the default setting of featureCounts should work exactly the same way as HTSeq-count does in union mode, except that the annotation files (GTF or GFF) are parsed differently. HTSeq-count excludes the "end" location from the feature interval, but featureCounts includes the "end" location in the interval. I believe that featureCounts parses the annotation file in the correct way according to the GTF/GFF format specification (http://genome.ucsc.edu/FAQ/FAQformat.html#format3), where the "end" location is said to be inclusive.
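To make the effect of the "end" coordinate concrete, here is a small sketch of how an inclusive versus exclusive feature end changes whether a read overlaps a feature. It only illustrates the difference described in this post and is not the parsing code of either tool; the `overlaps` function and the coordinates are hypothetical.

```python
def overlaps(read_start, read_end, feat_start, feat_end, end_inclusive=True):
    """Return True if a read overlaps a feature.

    All coordinates are 1-based, and the read interval is closed on both sides.
    If end_inclusive is False, the feature's last base is treated as feat_end - 1.
    """
    last_base = feat_end if end_inclusive else feat_end - 1
    return read_start <= last_base and read_end >= feat_start


# A read starting exactly on the feature's annotated end position (200).
hit_inclusive = overlaps(200, 249, 100, 200, end_inclusive=True)
hit_exclusive = overlaps(200, 249, 100, 200, end_inclusive=False)
print(hit_inclusive, hit_exclusive)
```

Only reads that touch a feature at exactly its last annotated base are affected, so under this description the two conventions would disagree on a small number of reads per feature.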

In paired-end mode, featureCounts does more than HTSeq-count by breaking ties for ambiguous fragments using votes. Each read in a fragment (read pair) casts a vote for every feature it overlaps. If one feature receives the uniquely highest number of votes (either 1 or 2), the fragment is assigned to that feature without ambiguity.
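The voting rule described above might look something like the following sketch. It is a simplification based only on this description, not featureCounts' actual implementation, and it assumes the set of features overlapped by each read has already been determined; `assign_fragment` and the gene names are hypothetical.

```python
from collections import Counter


def assign_fragment(read1_features, read2_features):
    """Assign a read pair to a feature by voting, or return None.

    Each argument is the set of feature IDs overlapped by one read of the pair.
    Each read contributes at most one vote per feature, so a feature can
    receive 0, 1, or 2 votes.
    """
    votes = Counter()
    for features in (read1_features, read2_features):
        for feature in set(features):
            votes[feature] += 1
    if not votes:
        return None  # neither read overlaps any feature
    ranked = votes.most_common()
    top_feature, top_votes = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None  # tie for the highest vote count: still ambiguous
    return top_feature


# Read 1 overlaps two genes, read 2 only geneA: geneA wins with 2 votes.
resolved = assign_fragment({"geneA", "geneB"}, {"geneA"})
# Each read overlaps a different gene: 1 vote each, so the tie remains.
unresolved = assign_fragment({"geneA"}, {"geneB"})
print(resolved, unresolved)
```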


htseq-count supports coordinate-sorted BAM files. Also, all of these differences were covered in the featureCounts paper, so I'm again at a loss for what HNK hopes to gain by doing this yet again.


Yeah, thanks. I just saw the featureCounts paper. I will make a simple Venn diagram to get the identical, htseq-count greater, and featureCounts greater numbers for my dataset results :)
