Dear all,
I'm recalculating RPKM values for an RNA-seq dataset using the featureCounts function in Rsubread, and I'd like to know whether the RPKM formula should use only the "Assigned" reads or the total reads, including "Unassigned_Ambiguity", "Unassigned_MultiMapping", and so on (see below). Searching forums and Mortazavi et al. (2008), I have only found that "N is the total number of mappable reads in the experiment". Could anybody please help with this?
RPKM = N/(L*T)
where:
N: number of reads assigned to a gene
L: length of the gene (kb)
T: total mapped reads (Millions)
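As a minimal sketch of the formula above, the Python snippet below converts a gene length in base pairs and a library size in reads into the kb and millions units used by L and T. The function name `rpkm` and the example numbers are hypothetical and chosen only for illustration; they are not taken from the dataset in this post.

```python
def rpkm(gene_reads, gene_length_bp, total_reads):
    """Reads Per Kilobase per Million mapped reads: N / (L * T)."""
    length_kb = gene_length_bp / 1_000        # L, in kilobases
    total_millions = total_reads / 1_000_000  # T, in millions of reads
    return gene_reads / (length_kb * total_millions)

# Hypothetical gene: 1,000 reads, 2,000 bp long, library of 20 million reads
example = rpkm(1_000, 2_000, 20_000_000)
print(example)  # 25.0
```

Whichever read total is chosen for T, it only rescales every gene in the sample by the same factor.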
T_reesei_F24.1_GGCTAC_L008_R1_001.cleanreads.fastq.gz_tophat2.F24h.1_accepted_hits.bam
Assigned 32270962
Unassigned_Ambiguity 6896
Unassigned_MultiMapping 116803
Unassigned_NoFeatures 10751746
Unassigned_Unmapped 0
Unassigned_MappingQuality 0
Unassigned_FragementLength 0
Unassigned_Chimera 0
Thanks in advance!
Well, RPKM is calculated with respect to the total number of mapped reads.
If you are working with reads uniquely mapped to the genome, then you should consider only the Assigned reads.
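To make the two choices concrete, here is a rough Python sketch that derives both candidate totals from the summary posted in the question. It assumes that every category except Unassigned_Unmapped counts as a mapped read, which is an interpretation and not something the featureCounts output states; the variable names are illustrative only.

```python
# Counts copied from the featureCounts summary in the question
summary = {
    "Assigned": 32270962,
    "Unassigned_Ambiguity": 6896,
    "Unassigned_MultiMapping": 116803,
    "Unassigned_NoFeatures": 10751746,
    "Unassigned_Unmapped": 0,
    "Unassigned_MappingQuality": 0,
    "Unassigned_FragementLength": 0,
    "Unassigned_Chimera": 0,
}

# Option 1: only reads assigned to a feature
assigned_total = summary["Assigned"]

# Option 2 (assumption): all reads that mapped, i.e. everything but Unmapped
mapped_total = sum(v for k, v in summary.items() if k != "Unassigned_Unmapped")

# Factor by which RPKM grows when T is the assigned total instead of the mapped total
scale = mapped_total / assigned_total

print(assigned_total, mapped_total, round(scale, 3))
# 32270962 43146407 1.337
```

For this sample, using only the Assigned reads as T raises every gene's RPKM by the same factor of about 1.34, so the ranking of genes within the sample is unchanged; the choice matters mainly when comparing samples whose assigned fractions differ.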
Thank you all! I really appreciate your answers!