I want to obtain all mapped and unmapped reads from my bamfiles but I don't understand the difference between these sets of flags:
For mapped reads:
samtools view -b -F 4 bamfile > mapped
# versus
samtools view -b -F 4 -F 260 bamfile > mapped
I've seen a ton of tutorials on this, and only a small percentage use the -F 260 flag, which I understand to mean primary alignment. So does this mean that using just -F 4 would extract any mapped read (even one whose mate is unmapped) along with potential secondary alignments? And is this considered an issue?
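To make the comparison concrete, here is a minimal sketch in plain shell arithmetic (no samtools needed) of the rule documented for samtools view: a read is kept when every bit given to -f is set in its FLAG and no bit given to -F is set. The helper name keeps and the example FLAG values are assumptions chosen for illustration; they do not come from a real BAM file.

```shell
#!/bin/sh
# keeps FLAG REQUIRED EXCLUDED -> prints "keep" or "drop"
# REQUIRED plays the role of -f, EXCLUDED the role of -F.
keeps() {
    if [ $(( $1 & $2 )) -eq "$2" ] && [ $(( $1 & $3 )) -eq 0 ]; then
        echo keep
    else
        echo drop
    fi
}

# Illustrative FLAG values (hypothetical reads):
#   99  = 1+2+32+64   mapped, primary, mate mapped
#   73  = 1+8+64      mapped, primary, mate unmapped
#   355 = 99+256      mapped, secondary alignment
#   77  = 1+4+8+64    unmapped, mate unmapped
for flag in 99 73 355 77; do
    echo "flag=$flag  -F 4: $(keeps "$flag" 0 4)  -F 260: $(keeps "$flag" 0 260)"
done
```

In this sketch the only read treated differently by the two filters is the secondary alignment (355): -F 4 keeps it, while -F 260 drops it because 260 = 4 + 256. The mapped read with an unmapped mate (73) passes both.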
Also I don't understand why there are sometimes three commands for unmapped reads, or just one command:
For unmapped reads
samtools view -b -f 4 bamfile > unmapped # -f 4 is unmapped; the most common command, okay simple enough
# versus
samtools view -b -f 4 -F 264 bamfile > unmapped1 # -f 4 is unmapped, but what is -F 264 saying??
The -F 264 is confusing to understand. Isn't that just the combination of flags 4 and 256? So saying -F 264 really reads as NOT unmapped but IS a primary alignment? This is what's confusing me because I do want to keep only reads in the primary alignment but I want them as unmapped. Why isn't this commonly just -f 4 and -F 256? I am basing this on Novocraft and some other tutorials I've seen online.
The other commands to extract remaining unmapped reads:
samtools view -b -f 8 -F 260 bamfile > unmapped2 # -f 8 is a mapped read but mate is unmapped; but -F 260 is the same thing as before, a combo of -f 4 and -f 256, but now the number is different?
Once again, according to samtools flags on the Picard website, 260 translates to mate unmapped (0x8) and not primary alignment (0x100), but using -F 260, wouldn't this mean the opposite? So in fact, it is a mapped mate and it is primary alignment?
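Since much of the confusion here is about which bits 260 and 264 are made of, a small decoder may help. This is a sketch in plain shell; decode is a hypothetical helper name, and it only splits a number into its powers of two. The meaning of each bit (4 = read unmapped, 8 = mate unmapped, 256 = not primary alignment) is taken from the SAM flag definitions.

```shell
#!/bin/sh
# decode FLAG -> prints the individual bits that are set, space-separated
decode() {
    flag=$1
    bit=1
    out=""
    while [ "$bit" -le "$flag" ]; do
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out }$bit"
        fi
        bit=$(( bit * 2 ))
    done
    echo "$out"
}

decode 260   # 4 256 -> read unmapped + not primary
decode 264   # 8 256 -> mate unmapped + not primary
decode 12    # 4 8   -> read unmapped + mate unmapped
```

So 260 and 264 are different combinations: 260 contains the read-unmapped bit, while 264 contains the mate-unmapped bit. If your samtools build includes the flags subcommand, samtools flags 264 should give the same breakdown by name.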
The final command:
samtools view -b -f 12 -F 256 bamfile > unmapped3 # -f 12 is read unmapped and mate unmapped; -F 256 is primary alignment; this makes sense to me
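One way to see why the tutorials use three different -F values is to check which of the three filters a given FLAG passes. The sketch below uses plain shell arithmetic and assumes the documented samtools view rule (all -f bits set, no -F bits set); matches and classify are hypothetical helper names, and the FLAG values are made-up examples.

```shell
#!/bin/sh
# matches FLAG REQUIRED EXCLUDED -> succeeds if the read would pass -f REQUIRED -F EXCLUDED
matches() {
    [ $(( $1 & $2 )) -eq "$2" ] && [ $(( $1 & $3 )) -eq 0 ]
}

# classify FLAG -> names of the three output files the read would land in
classify() {
    sets=""
    if matches "$1" 4 264;  then sets="$sets unmapped1"; fi   # -f 4  -F 264
    if matches "$1" 8 260;  then sets="$sets unmapped2"; fi   # -f 8  -F 260
    if matches "$1" 12 256; then sets="$sets unmapped3"; fi   # -f 12 -F 256
    sets=${sets# }
    echo "${sets:-none}"
}

# Illustrative FLAG values (hypothetical reads):
classify 69   # 1+4+64:   read unmapped, mate mapped
classify 73   # 1+8+64:   read mapped, mate unmapped
classify 77   # 1+4+8+64: both unmapped
classify 99   # 1+2+32+64: both mapped
```

With these example values each read lands in at most one file: 69 in unmapped1, 73 in unmapped2, 77 in unmapped3, and 99 in none. The extra bit in each -F value (8 in 264, 4 in 260) is what keeps the three sets from overlapping. Note that unmapped2 holds reads that are themselves mapped but whose mate is not.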
I guess at the end of it, if I want mapped and unmapped reads in the primary alignment I would run something like this?
#mapped
samtools view -b -F 4 -F 256 bamfile > mapped
# OR
samtools view -b -F 260 bamfile > mapped
# then sort them
samtools sort -n mapped > mapped_sorted
#unmapped
samtools view -b -f 4 -F 256 bamfile > unmapped1
samtools view -b -f 8 -F 256 bamfile > unmapped2
samtools view -b -f 12 -F 256 bamfile > unmapped3
samtools merge - unmapped[123] | samtools sort -n -o unmapped_sorted -
But here I would be keeping the -F 256 flag each time for this to make sense right?
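As a quick check on whether -F 256 alone is enough for all three commands, the following sketch counts how many of the proposed filters (-f 4, -f 8 and -f 12, each with -F 256) a given FLAG would pass. It is plain shell arithmetic under the documented samtools view rule; matches and count_sets are hypothetical helper names and the FLAG values are illustrative.

```shell
#!/bin/sh
# matches FLAG REQUIRED EXCLUDED -> succeeds if the read would pass -f REQUIRED -F EXCLUDED
matches() {
    [ $(( $1 & $2 )) -eq "$2" ] && [ $(( $1 & $3 )) -eq 0 ]
}

# count_sets FLAG -> number of the three proposed output files containing the read
count_sets() {
    n=0
    for required in 4 8 12; do
        if matches "$1" "$required" 256; then
            n=$(( n + 1 ))
        fi
    done
    echo "$n"
}

# Illustrative FLAG values (hypothetical reads):
#   69 = read unmapped, mate mapped
#   73 = read mapped, mate unmapped
#   77 = read and mate both unmapped
for flag in 69 73 77; do
    echo "flag=$flag appears in $(count_sets "$flag") file(s)"
done
```

In this sketch a read with both ends unmapped (77) passes all three filters, so it would show up three times after merging, whereas 69 and 73 each appear once. That overlap is what the 264 and 260 values in the tutorial commands appear to be guarding against.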
I just want to know what is best practise and if the flags I've chosen make sense because I haven't seen this code in other tutorials but their flags confuse me.
Okay, but if I already have the flag -f 4 and then I want to call NOT primary alignment, it would be -F 256. So in a sense, -f 4 and -F 256. I don't understand how in the same line of code people write -f 4 -F 260. It seems like the -F 260 is including the -f 4 flag, but instead of a -f it is -F, which to me is the opposite of what I want in that case? And yes, I typo'd; I meant 4+256 = 260.
So in Novocraft and other examples they write:
But why isn't it
The -F 264 means NOT mate unmapped and not primary alignment? But that means I am also extracting mapped mates and primary alignment, along with the -f 4, which means unmapped reads.