I have 6,673,385 reads in each paired-end file after quality filtering. But when I map them with TopHat and run samtools flagstat on the resulting BAM file, it gives the following output:
1343686 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1343686 + 0 mapped (100.00%:-nan%)
1343686 + 0 paired in sequencing
670808 + 0 read1
672878 + 0 read2
1203600 + 0 properly paired (89.57%:-nan%)
1311198 + 0 with itself and mate mapped
32488 + 0 singletons (2.42%:-nan%)
15874 + 0 with mate mapped to a different chr
452 + 0 with mate mapped to a different chr (mapQ>=5)
I am not sure how to interpret the samtools flagstat output. My assumption is that only 670808 reads from pair 1 and 672878 reads from pair 2 are mapped. Is that correct? That is about 1/10th of the total input reads. Where are the rest of my reads?
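To double-check the arithmetic behind that assumption, here is a minimal Python sketch. The variable names and the `pct` helper are my own, and the counts are copied by hand from the flagstat output above. It only checks that the counts are internally consistent and recomputes the percentages; it does not settle what each category actually counts.

```python
# Counts copied from the samtools flagstat output above.
total = 1343686            # "in total"
read1 = 670808             # "read1"
read2 = 672878             # "read2"
properly_paired = 1203600  # "properly paired"
both_mapped = 1311198      # "with itself and mate mapped"
singletons = 32488         # "singletons"

# Reads in each paired-end file after quality filtering (from my own count).
input_per_file = 6673385


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# read1 and read2 add up to the flagstat total.
print(read1 + read2 == total)             # True
# Paired-mapped entries plus singletons also add up to the total.
print(both_mapped + singletons == total)  # True

# Recompute the percentages that flagstat prints.
print(pct(properly_paired, total))        # 89.57
print(pct(singletons, total))             # 2.42

# Fraction of my input that the read1 count represents (the "1/10th").
print(pct(read1, input_per_file))         # 10.05
```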
The report produced by TopHat shows some other statistics:
Left reads:
Input : 468668
Mapped : 443344 (94.6% of input)
of these: 216780 (48.9%) have multiple alignments (1 have >20)
Right reads:
Input : 468668
Mapped : 444468 (94.8% of input)
of these: 217699 (49.0%) have multiple alignments (1 have >20)
94.7% overall read mapping rate.
Aligned pairs: 433356
of these: 211726 (48.9%) have multiple alignments
5512 ( 1.3%) are discordant alignments
91.3% concordant pair alignment rate.
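For reference, the percentages in this report can be reproduced from the raw counts with the sketch below. The variable names are my own. The formula for the concordant pair rate, (aligned pairs minus discordant pairs) divided by input pairs, is my assumption; it does reproduce the reported 91.3%.

```python
# Counts copied from the TopHat report above.
left_input, left_mapped, left_multi = 468668, 443344, 216780
right_input, right_mapped, right_multi = 468668, 444468, 217699
aligned_pairs, pairs_multi, discordant = 433356, 211726, 5512


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Per-side mapping rates.
print(pct(left_mapped, left_input))    # 94.6
print(pct(right_mapped, right_input))  # 94.8

# Share of mapped reads and aligned pairs with multiple alignments.
print(pct(left_multi, left_mapped))    # 48.9
print(pct(right_multi, right_mapped))  # 49.0
print(pct(pairs_multi, aligned_pairs)) # 48.9

# Overall rate: all mapped reads over all input reads.
overall = pct(left_mapped + right_mapped, left_input + right_input)
print(overall)                         # 94.7

# Discordant share of the aligned pairs.
print(pct(discordant, aligned_pairs))  # 1.3

# ASSUMPTION: concordant rate = (aligned pairs - discordant) / input pairs.
concordant = pct(aligned_pairs - discordant, left_input)
print(concordant)                      # 91.3
```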
How should I interpret all these numbers? Thank you.