I have a data set of paired-end samples which I'm mapping with bowtie2.
I am mainly focus on the unmapped reads, as these are the reads we're interested in.
This is a sample output of the mapping results from on of my runs:
11216394 reads; of these:
11216394 (100.00%) were paired; of these:
8079466 (72.03%) aligned concordantly 0 times
2987422 (26.63%) aligned concordantly exactly 1 time
149506 (1.33%) aligned concordantly >1 times
----
8079466 pairs aligned concordantly 0 times; of these:
417642 (5.17%) aligned discordantly 1 time
----
7661824 pairs aligned 0 times concordantly or discordantly; of these:
15323648 mates make up the pairs; of these:
15226093 (99.36%) aligned 0 times
63924 (0.42%) aligned exactly 1 time
33631 (0.22%) aligned >1 times
32.13% overall alignment rate
Out of this output, how many unmapped reads do I have?
From my understanding I was thinking it is the fourth row from the bottom = 15226093?
If I calculate all three options in this summary i have aligned once - (29874222) + (417642 *2) + 63924 = 6874052 aligned >1 - (1495062) + 33631 = 332643 unaligned - 15226093 sum = 6874052 + 332643 + 15226093 = 22432788
This last number is the sum of my input reads of this sample.
But When I look at the number of reads in the unmapped.fastq
file after the mapping it corresponds to the third row of the file (x2) = 8079466 x2 = 16158932?
But than adding the mapped and unmapped reads together give me more reads, than the there were in total.
Can anyone help me understand what is happening here?
How do I calculate the number of unmapped reads in a paired-end mapped sample?
thanks Assa
What was the exact command you used? There are a few options for outputting various subsets of unmapped reads, it's likely you used the wrong one.
This is the commands i use
The file I need in the next step is
firstRun.Unmapped.gz
.I didn't subset the bam file with samtools, I thought the unmapped reads are outputted to this file.
Just to add - this is the output of the same file when running the
samtools flagstat
commandI can't see how the numbers of the stat (bowtie2 output) file and the flagstat output correlate.