Hi everyone,
I'm doing read counting and found something strange:
I used RSeQC infer_experiment.py and got the result below, which shows that my data is definitely reverse-stranded.
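As a side note on how the two fractions reported by infer_experiment.py are usually mapped to the featureCounts -s value, here is a minimal Python sketch. The function name and the 0.8 threshold are my own assumptions, and the fractions in the example are made up, not taken from this dataset.

```python
def featurecounts_strand_flag(frac_sense, frac_antisense, min_ratio=0.8):
    """Suggest a featureCounts -s value from infer_experiment.py fractions.

    frac_sense:     fraction explained by "1++,1--,2+-,2-+" (or "++,--" for single-end)
    frac_antisense: fraction explained by "1+-,1-+,2++,2--" (or "+-,-+" for single-end)
    min_ratio:      hypothetical cut-off for calling a library stranded
    """
    determined = frac_sense + frac_antisense
    if determined <= 0:
        raise ValueError("no reads with a determined orientation")
    if frac_sense / determined >= min_ratio:
        return 1  # forward-stranded
    if frac_antisense / determined >= min_ratio:
        return 2  # reverse-stranded
    return 0      # no clear strand bias, treat as unstranded


# Made-up fractions for illustration only
print(featurecounts_strand_flag(0.02, 0.95))
```

With a strong majority in the second category, the sketch suggests -s 2, which matches the choice described in this post.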
So I ran featureCounts with -s 2 for counting and found a low assignment rate:
So I tried -s 0 and -s 1 for comparison and found that both show a high assignment rate:
-s 0:
-s 1:
Here I also show the STAR mapping result (after mapping, I filtered the alignments with samtools view -f 2 -F 256):
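For readers less familiar with SAM flags: -f 2 keeps only alignments with the "properly paired" bit set, and -F 256 drops alignments with the "secondary alignment" bit set. A minimal Python sketch of that bit test follows; the flag values in the example are made up. Note that the primary alignment of a multi-mapping read does not carry the 256 bit, so it passes this filter.

```python
PROPER_PAIR = 0x2    # bit required by -f 2
SECONDARY = 0x100    # bit excluded by -F 256


def passes_filter(flag, require=PROPER_PAIR, exclude=SECONDARY):
    """Mimic `samtools view -f <require> -F <exclude>` for one SAM flag."""
    return (flag & require) == require and (flag & exclude) == 0


# Made-up example flags
examples = {
    99: "properly paired, first in pair, primary",
    355: "same as 99 but marked secondary",
    73: "paired, mate unmapped, not properly paired",
}
for flag, description in examples.items():
    print(flag, description, "->", "kept" if passes_filter(flag) else "dropped")
```

So a read that maps to several loci can still be present in the filtered BAM through its primary alignment, which may be why a counting tool that checks the NH tag still reports multi-mapping reads.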
Question:
1. Is there anything wrong with my argument settings?
2. Setting the -s comparison aside, why does featureCounts always report a low assignment rate? Why is the total number of reads in featureCounts so large? And even though I had already filtered out multi-mapping reads with samtools, why does the featureCounts summary still show so many multiple-assigned reads?
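To compare runs, the per-category fractions can be computed from the featureCounts .summary file, which is a tab-separated table with a Status column followed by one column per BAM file. Below is a minimal Python sketch that handles only the first sample column; the function name is hypothetical and the numbers are made up for illustration, not taken from my data.

```python
import csv
import io


def assignment_rates(summary_text):
    """Return {status: fraction of total} for the first sample column."""
    reader = csv.reader(io.StringIO(summary_text), delimiter="\t")
    next(reader)  # skip the header line: Status <tab> sample name
    counts = {row[0]: int(row[1]) for row in reader if row}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: n / total for status, n in counts.items() if n > 0}


# Made-up summary content for illustration only
example = (
    "Status\tsample.bam\n"
    "Assigned\t600\n"
    "Unassigned_MultiMapping\t100\n"
    "Unassigned_NoFeatures\t250\n"
    "Unassigned_Ambiguity\t50\n"
)
for status, rate in assignment_rates(example).items():
    print(f"{status}\t{rate:.1%}")
```

Looking at which Unassigned_* category dominates is usually more informative than the assigned rate alone.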
Thank you very much for reading and for your suggestions!
STAR has a --quantMode GeneCounts parameter, which should output counts using the same method as htseq-count. And why do you filter the BAM file before counting?
It may be due to poor annotation. What is the organism? What are the genome and annotation versions?
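On the STAR --quantMode GeneCounts suggestion: STAR writes a ReadsPerGene.out.tab file with four columns (gene ID, unstranded counts, counts for read 1 on the RNA strand, counts for read 2 on the RNA strand), preceded by four N_* summary rows. A minimal Python sketch for picking the column that matches the library type is below; the function name and the gene IDs and counts in the example are made up.

```python
# Column index in ReadsPerGene.out.tab for each library type
STAR_COLUMN = {"unstranded": 1, "forward": 2, "reverse": 3}


def read_star_gene_counts(text, strandedness="reverse"):
    """Split STAR gene counts into (per-gene counts, N_* summary rows)."""
    col = STAR_COLUMN[strandedness]
    genes, stats = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Rows such as N_unmapped and N_noFeature are summaries, not genes
        target = stats if fields[0].startswith("N_") else genes
        target[fields[0]] = int(fields[col])
    return genes, stats


# Made-up file content for illustration only
example = (
    "N_unmapped\t10\t10\t10\n"
    "N_multimapping\t20\t20\t20\n"
    "N_noFeature\t30\t300\t40\n"
    "N_ambiguous\t5\t1\t2\n"
    "GeneA\t100\t3\t97\n"
    "GeneB\t50\t1\t49\n"
)
genes, stats = read_star_gene_counts(example, "reverse")
print(genes)
print(stats["N_noFeature"])
```

Comparing N_noFeature between columns 3 and 4 is also a quick sanity check of strandedness: the wrong strand typically shows a much larger N_noFeature.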
Are you telling featureCounts you have paired reads?
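On the paired-end point: without -p, featureCounts treats each mate as a separate read, which would roughly double the reported total. Here is a minimal Python sketch that assembles such a command line. The file names are hypothetical, and as far as I know --countReadPairs is only needed (and only available) from Subread 2.0.2 onwards, so check featureCounts -h for your version.

```python
import shlex


def featurecounts_command(bam, gtf, out, strand=2, paired=True,
                          count_pairs=True, threads=4):
    """Build a featureCounts argument list (does not run anything)."""
    cmd = ["featureCounts", "-T", str(threads), "-s", str(strand),
           "-a", gtf, "-o", out]
    if paired:
        cmd.append("-p")  # input contains paired-end reads
        if count_pairs:
            # Assumption: Subread >= 2.0.2, where fragments are counted
            # only when this option is given in addition to -p
            cmd.append("--countReadPairs")
    cmd.append(bam)
    return cmd


# Hypothetical file names for illustration only
cmd = featurecounts_command("sample.bam", "annotation.gtf", "counts.txt")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths point to real files.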
Overlapping features?
Thank you for your kind reply, h.mon!
I normally remove multi-mapped reads after mapping using samtools because I never use these reads for downstream analysis, and I think HTSeq also discards them.
"What is the organism? What are the genome and annotation versions?"
I used the mm10 reference genome from Ensembl and its corresponding GTF annotation.
I think the low assignment rate is due to reads that fall into intronic and intergenic regions. And when I use unstranded mode, it seems to count each read twice.
Thank you again!