Hey all, I have a question about the output generated by featureCounts. What would be the difference in output between the following commands:
featureCounts -p -s 2 -F GTF -t exon -a annotation.gtf -o counts.txt sample.bam
featureCounts -p -s 2 -F GTF -t gene -a annotation.gtf -o counts.txt sample.bam
From what I understand, the second command will only read the lines in the GTF that have 'gene' in the 3rd column, while the first command will count the exon lines. Also, the sum of the reads over all exons of a gene should be the same as the number of reads for the gene as a whole. So, should the output file have the same counts for both?
Thanks,
Ah, thanks for the nice explanation. I hadn't thought about reads aligning to introns skewing the results. So, ideally, for an RNA-seq experiment, I would want to use -t exon, correct?

Yes, definitely in most cases!
Thanks a lot, cheers!
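
To make the intron point concrete, here is a toy sketch in Python. It is not how featureCounts works internally; the coordinates and reads are invented, each read is reduced to a single interval, and reads are counted once per gene (roughly what the default gene-level summarization gives you). It only shows why counting over the whole gene span can give more reads than counting over exons.

```python
# Toy model: one gene with two exons separated by an intron.
# All coordinates are made up for illustration.
GENE = (100, 500)                 # gene span, inclusive
EXONS = [(100, 200), (400, 500)]  # exon spans, inclusive

# Each read is reduced to one (start, end) alignment interval.
READS = [(110, 160), (150, 199), (250, 300), (420, 470), (300, 350)]


def overlaps(read, feature):
    """True if the read shares at least one base with the feature."""
    return read[0] <= feature[1] and feature[0] <= read[1]


def count_reads(reads, features):
    """Count reads overlapping any of the features, each read at most once."""
    return sum(any(overlaps(r, f) for f in features) for r in reads)


exon_count = count_reads(READS, EXONS)   # like -t exon, summarized per gene
gene_count = count_reads(READS, [GENE])  # like -t gene

print("exon-based count:", exon_count)
print("gene-based count:", gene_count)
```

The two reads that fall entirely inside the intron are only picked up by the gene-based count, so the two numbers differ for this gene.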
PS: unrelated to your question, but usually we do not use the -o option, because it is often better to discard reads that cannot be unambiguously assigned, at least for differential expression analysis.

Yes, absolutely! I don't use the -O option either. The -o I have used is just to define the output file.
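
For anyone confused by the two flags: -o names the output file, while -O allows a read that overlaps several genes to be counted for each of them. Below is a small Python sketch of that second behaviour. The gene names, coordinates, and the allow_multi_overlap parameter are hypothetical and only mimic the idea; this is not featureCounts code.

```python
# Two hypothetical genes whose spans overlap between positions 250 and 300.
GENES = {"geneA": (100, 300), "geneB": (250, 500)}

# Each read is reduced to one (start, end) alignment interval.
READS = [(120, 170), (260, 290), (400, 450)]


def overlaps(read, feature):
    """True if the read shares at least one base with the feature."""
    return read[0] <= feature[1] and feature[0] <= read[1]


def assign(reads, genes, allow_multi_overlap=False):
    """Return (per-gene counts, number of reads left unassigned as ambiguous)."""
    counts = {name: 0 for name in genes}
    ambiguous = 0
    for read in reads:
        hits = [name for name, span in genes.items() if overlaps(read, span)]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            if allow_multi_overlap:
                # Mimics -O: the read is counted for every gene it overlaps.
                for name in hits:
                    counts[name] += 1
            else:
                # Mimics the default: the ambiguous read is not counted.
                ambiguous += 1
    return counts, ambiguous


default_counts, default_ambiguous = assign(READS, GENES)
multi_counts, multi_ambiguous = assign(READS, GENES, allow_multi_overlap=True)

print(default_counts, default_ambiguous)
print(multi_counts, multi_ambiguous)
```

In the default mode the read in the shared region is dropped, which is usually the safer choice for differential expression; with multi-overlap enabled it inflates the counts of both genes.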