I was working with featureCounts and ran it three times, changing only one parameter: the strand specificity. I expected the reads assigned with -s 1 and the reads assigned with -s 2 to sum to the number of reads assigned with -s 0.
featureCounts --minReadOverlap 50 -s 0 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
featureCounts --minReadOverlap 50 -s 1 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
featureCounts --minReadOverlap 50 -s 2 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
However, the results differ:
==> minReadOverlap50_strand1/total_counts.txt.summary <==
Status WEN1_6558.sort.bam WNN1_6545.sort.bam WNN2_6550.sort.bam
Assigned 4643945 8863560 8859072
==> minReadOverlap50_strand2/total_counts.txt.summary <==
Status WEN1_6558.sort.bam WNN1_6545.sort.bam WNN2_6550.sort.bam
Assigned 4775184 9123446 9158397
==> minReadOverlap50/total_counts.txt.summary <==
Status WEN1_6558.sort.bam WNN1_6545.sort.bam WNN2_6550.sort.bam
Assigned 8356549 15881043 15933323
4643945+4775184=9419129 != 8356549
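The same check can be repeated for all three samples. A minimal sketch, with the Assigned values copied from the summaries above; the variable names are illustrative only:

```python
# Assigned counts copied from the three .summary files above.
samples = ["WEN1_6558.sort.bam", "WNN1_6545.sort.bam", "WNN2_6550.sort.bam"]
assigned = {
    "s0": [8356549, 15881043, 15933323],
    "s1": [4643945, 8863560, 8859072],
    "s2": [4775184, 9123446, 9158397],
}

# Excess of (-s 1) + (-s 2) over (-s 0), per sample.
excess = {
    name: s1 + s2 - s0
    for name, s0, s1, s2 in zip(
        samples, assigned["s0"], assigned["s1"], assigned["s2"]
    )
}

for name, diff in excess.items():
    print(f"{name}: s1 + s2 exceeds s0 by {diff}")
```

In every sample the two stranded runs together assign more reads than the unstranded run, so the discrepancy is systematic rather than specific to one BAM file.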
Why don't the reads sum up?
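One plausible explanation is that a read overlapping features on both strands is assigned in the -s 1 run and again in the -s 2 run, but only once in the -s 0 run. The toy model below illustrates that idea. It is not featureCounts' actual implementation: the reads, features, and helper names are made up for the illustration, and it ignores --minReadOverlap, mapping quality, and paired-end handling.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int
    end: int     # inclusive
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def strand_ok(read: Interval, feature: Interval, mode: int) -> bool:
    """Toy version of -s: 0 = unstranded, 1 = same strand, 2 = opposite strand."""
    if mode == 0:
        return True
    if mode == 1:
        return read.strand == feature.strand
    if mode == 2:
        return read.strand != feature.strand
    raise ValueError(f"unknown strand mode: {mode}")


def count_assigned(reads, features, mode):
    """Count reads that hit at least one compatible feature (each read once)."""
    return sum(
        any(overlaps(r, f) and strand_ok(r, f, mode) for f in features)
        for r in reads
    )


# Hypothetical features: the first two overlap and lie on opposite strands.
features = [
    Interval(100, 200, "+"),
    Interval(150, 250, "-"),
    Interval(400, 500, "+"),
]

# Hypothetical reads.
reads = [
    Interval(110, 140, "+"),  # hits only the "+" feature: counted by -s 1
    Interval(160, 190, "+"),  # hits both overlapping features: -s 1 and -s 2
    Interval(210, 240, "+"),  # hits only the "-" feature: counted by -s 2
    Interval(420, 450, "-"),  # "-" read in a "+" feature: counted by -s 2
    Interval(300, 330, "+"),  # hits nothing: never assigned
]

s0 = count_assigned(reads, features, 0)
s1 = count_assigned(reads, features, 1)
s2 = count_assigned(reads, features, 2)
print(s0, s1, s2, s1 + s2 - s0)
```

Under this model, s1 + s2 exceeds s0 by exactly the number of reads that are compatible with a feature in both stranded modes, which can only happen where features on opposite strands overlap.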
I have ChIP-seq data, so I have reads on both the "+" and "-" strands. I have regions in which I want to count my reads. Regions can overlap and can be on either the "+" or the "-" strand. I do not care about strand: any read that falls into a feature should be counted, whether the feature is on the "+" or the "-" strand. Does that mean I should use -s 0 and ignore strand specificity?
Exactly! -s 1 and -s 2 are mostly used with strand-specific RNA-seq data. With ChIP-seq data you'll usually use -s 0.
I have one small point of confusion. I am using paired-end RNA-seq data, and the sequencing platform was 'Illumina NextSeq 500 (Homo sapiens)'. For quantification I have used the following command
I want to see counts for both strands (reverse and forward) in my output. Which option should I add to my command: -s 0, -s 1, or -s 2?
If you want to count on both strands, use -s 0 (which is the default behaviour).