I have strand-specific RNA-seq data. I mapped the negative-strand reads into a negative-strand BAM file and the positive-strand reads into a positive-strand BAM file, and I have the read counts. Now I need to RPKM-normalize them, but I am confused about whether I should use the total reads of the strand-specific BAM file or the total reads of both BAM files.
RPKM is OK for within-sample normalization, but if you have multiple conditions and want to compare them with each other, you should use another normalization method.
That being said, for RPKM you should always normalize by the total number of reads from both strands, as there is no reason to normalize the plus and minus strands separately. Using a single library total for both strands lets you compare, for instance, the expression level of a gene with the expression level of its antisense transcripts.
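To make the difference concrete, here is a minimal Python sketch using the standard RPKM formula (reads × 10^9 / (feature length in bp × total mapped reads)). All the numbers and variable names are made up for illustration; they are not from your data.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)

# Hypothetical library: totals from the two strand-specific BAM files.
plus_total = 12_000_000
minus_total = 8_000_000
library_total = plus_total + minus_total  # denominator shared by both strands

# Hypothetical 2 kb locus: 500 sense reads (plus strand), 50 antisense reads (minus strand).
sense_count, antisense_count, length_bp = 500, 50, 2000

# Same denominator for both strands: the ratio matches the raw-count ratio.
sense_rpkm = rpkm(sense_count, length_bp, library_total)
antisense_rpkm = rpkm(antisense_count, length_bp, library_total)
ratio_combined = sense_rpkm / antisense_rpkm

# Per-strand denominators: each strand is scaled differently, so the ratio shifts.
sense_rpkm_strand = rpkm(sense_count, length_bp, plus_total)
antisense_rpkm_strand = rpkm(antisense_count, length_bp, minus_total)
ratio_per_strand = sense_rpkm_strand / antisense_rpkm_strand

print(sense_rpkm, antisense_rpkm, ratio_combined)
print(sense_rpkm_strand, antisense_rpkm_strand, ratio_per_strand)
```

With the combined total, the sense/antisense ratio stays equal to the ratio of the raw counts, because both features have the same length and the same denominator. With per-strand totals, the ratio changes whenever the two BAM files hold different numbers of reads.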
But if I normalize based on the strand-specific BAM file, would that be considered wrong?
It depends on what you want to do. If you show both strands on the same figure, then it is definitely wrong, because the two strands would be scaled by different totals. If you show only one strand, then maybe it's OK, but it's still not best practice IMHO.