I used TopHat to map my single-end RNA-seq reads to the human genome (hg19) and got several files in the TopHat output folder. I need to know how many reads mapped to the reference genome, how many did not map, and the overall alignment percentage.
BTW, I used the following samtools command:
samtools flagstat accepted_hits.bam
and it gave me the following results:
175638490 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
175638490 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
Is there any explanation for the zeros above?
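A note on reading the output above: the zero lines are all paired-end categories, so they are expected for single-end data. The 100% mapped figure is also expected, because accepted_hits.bam holds only aligned records, and the total counts alignment records rather than reads (a multi-mapped read appears more than once). Below is a minimal sketch of one way to get read-level numbers instead. It assumes samtools is on your PATH and that your TopHat version wrote an unmapped.bam next to accepted_hits.bam; the function names and the tophat_out path are hypothetical, chosen only for illustration.

```shell
# Count distinct read names in a BAM file (single-end data assumed).
# Needs samtools on PATH; the function is only defined here, not run.
count_unique_reads() {
    samtools view "$1" | cut -f1 | sort -u | wc -l
}

# Percentage of mapped reads, given mapped and unmapped read counts.
percent_mapped() {
    awk -v m="$1" -v u="$2" 'BEGIN {
        total = m + u
        if (total == 0) { print "NA"; exit }   # avoid division by zero
        printf "%.2f\n", 100 * m / total
    }'
}

# Example usage (hypothetical paths, adjust to your output folder):
# mapped=$(count_unique_reads tophat_out/accepted_hits.bam)
# unmapped=$(samtools view -c tophat_out/unmapped.bam)
# percent_mapped "$mapped" "$unmapped"
```

Sorting read names for a file of this size can take a while and a fair amount of temporary disk space, so treat this as a cross-check rather than a replacement for TopHat's own summary.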
Thanks Martombo. Could you please tell me where I can find align_summary.txt?
It should be in the TopHat output folder.
Hi Martombo, could you please tell me exactly where this file is located? I didn't find it in my output folder.
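One way to check is to search the output folder recursively. This is a small sketch, not a definitive answer: the function name is hypothetical, and the folder is ./tophat_out only if you did not pass -o/--output-dir. As far as I know, older TopHat releases do not write align_summary.txt at all, so an empty result may simply mean your version predates it; that is an assumption worth checking against your installed version.

```shell
# Search a TopHat output directory (and its subdirectories) for
# align_summary.txt. Defaults to the current directory if no argument
# is given. Prints nothing if the file does not exist.
find_align_summary() {
    find "${1:-.}" -type f -name 'align_summary.txt' 2>/dev/null
}

# Example usage (hypothetical path):
# find_align_summary ./tophat_out
```

If nothing is printed, the read counts can still be derived from the BAM files in the same folder.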