HTSeq count versus summarizeOverlaps, mismatch of exon counts
8.1 years ago
Gama313 ▴ 130

Hello everybody,

I am a newbie and I've recently started analysing RNA-Seq data. I used, respectively:

  • HTSeqcount:
    • dexseq_annotation.py;
    • dexseq_count.py;
  • summarizeOverlaps:
    • exonicParts= (txdb, aggregateGenes=FALSE);
    • se=summarizeOverlaps(exonicParts,bamfiles,mode="Union", ignore.strand=TRUE, singleEnd=TRUE, fragments=FALSE, inter.feature=FALSE)

I did this to obtain exon-level counts as input for DEXSeq, in order to assess exon usage. The problem is that I get very different counts depending on which program I use: in particular, summarizeOverlaps appears to count far too many reads as uniquely mapped (I got 18,000,000 total unique reads with HTSeq vs 42,000,000 apparently unique reads with summarizeOverlaps). Could somebody explain why this happens? Thank you in advance.

Filippo

RNA-Seq • 1.9k views

I think I might be able to help, but could you please reformat your message so that the list is easier to read?

8.1 years ago

Do not reinvent the wheel by using R; as you've noticed, you're likely to get the wrong results. The DEXSeq scripts are known to produce the correct results, so do not use anything else.

The reason you get different results in R is that you're counting different alignments in a different manner.
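To make "counting in a different manner" concrete, here is a toy Python sketch. It is not the algorithm of either tool: the features, reads, and the two counting rules are hypothetical and chosen only for illustration. The permissive rule mimics the effect of the `ignore.strand=TRUE` and `inter.feature=FALSE` settings from the question (strand is ignored and a read is counted once for every feature it overlaps). The strict rule is an assumed stand-in for a strand-aware counter that drops reads hitting more than one feature.

```python
from collections import Counter

# Hypothetical toy annotation: (name, start, end, strand), half-open coordinates.
FEATURES = [
    ("geneA:E001", 100, 200, "+"),
    ("geneA:E002", 200, 300, "+"),
    ("geneB:E001", 250, 350, "-"),
]

# Hypothetical toy reads: (start, end, strand).
READS = [
    (110, 160, "+"),  # inside geneA:E001 only
    (180, 230, "+"),  # spans geneA:E001 and geneA:E002
    (260, 290, "+"),  # geneA:E002, plus geneB:E001 if strand is ignored
    (260, 290, "-"),  # geneB:E001, plus geneA:E002 if strand is ignored
    (400, 450, "+"),  # overlaps no feature
]


def hits(read, ignore_strand):
    """Return the names of all features the read overlaps."""
    start, end, strand = read
    return [
        name
        for name, f_start, f_end, f_strand in FEATURES
        if start < f_end and f_start < end
        and (ignore_strand or strand == f_strand)
    ]


def count_reads(reads, ignore_strand, count_every_hit):
    """Count reads per feature under one of the two toy rules."""
    counts = Counter()
    for read in reads:
        overlapping = hits(read, ignore_strand)
        if count_every_hit:
            # Permissive: one count for every overlapped feature.
            counts.update(overlapping)
        elif len(overlapping) == 1:
            # Strict: keep the read only if it hits exactly one feature.
            counts[overlapping[0]] += 1
    return counts


permissive = count_reads(READS, ignore_strand=True, count_every_hit=True)
strict = count_reads(READS, ignore_strand=False, count_every_hit=False)

print("reads:", len(READS))
print("permissive total:", sum(permissive.values()))
print("strict total:", sum(strict.values()))
```

With these five toy reads the permissive rule yields a count total of 7 and the strict rule a total of 3. Under the permissive rule the sum of the count table is therefore not a number of unique reads, which may be part of why the two totals in the question differ so much.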
