Hi All!!
While aligning RNA-Seq data and quantifying the actual number of reads mapped to my transcriptome, I realised that samtools flagstat does not report the actual alignied reads stats. I used RSEM for my alignment and expression calculation. I am aware the samtools counts secondary alignments thereby increasing total number of reads that pass QC and therefore influence the alignment percentage that it throws as the output. Source here
Is there any other tool that can precisely give the actual number of reads correctly mapping in pairs to the transcriptome? Somewhat like the log.out file in STAR does.
Hi Devon. Used the above command and I am getting the following stats
However, from the FastQC report that I have, I know my sample has 88.5 million reads (each in R1 and R2). This 2 x 88.5 million should be equal to the number of reads I get in QC Passed reads ~177 million. How should I account for the rest of the ~74 million reads?
They're not in the BAM file, so likely you excluded them elsewhere.
Thanks @Devon
Found a way around. I revisited the RSEM source code and read it. I found that the Log.final.out file is generated in the sample_temp folder which is deleted at the end. Using --keep-intermediate-files we can preserve the alignment stats and other logs too.