Hi all,
First post, hopefully my forum etiquette is ok!
Anyway, the question: Both the Bowtie2 and Hisat2 manuals state:
"Flags relevant to Bowtie2/Hisat2 are: 1: The read is one of a pair 2: The alignment is one end of a proper paired-end alignment ...etc... 128: The read is mate 2 in a pair"
However, my recent Hisat2 run did output SAM flags above 256 (e.g. 403, 355)...
So which flags are actually supported by these mappers? If I see 355, then is the constituent 256 flag 'real'? If I see no flags of 512 or above, can I assume that zero reads failed quality check (512) and that there are zero PCR or optical duplicates (1024)?
If I can quickly remove duplicate reads and non-primary alignments based on these flags using samtools view, that would be great; if not, I will need more downstream tools to achieve the same. Either way, I need to know what's going on!
Thanks in advance!
Hi i.sudbery, thanks for your quick answer! Thanks for the link, though it's not that I don't know what the flags mean, it's that I don't know which flags Hisat2 and Bowtie2 have the capacity to assign.
e.g. if my Hisat2 run outputs flags such as 403 then I guess I can immediately remove non-primary alignments without further processing... but the Hisat2 manual says it doesn't assign this flag to the SAM files it outputs.
Which makes me wonder, are these mappers also capable of detecting PCR & optical duplicates internally, then assigning the appropriate flag (1024)?
If this is not the case, then I plan to use picard to remove duplicates, but I'm curious whether this can be achieved on the raw SAM file that the mappers create.
Where does it say that Hisat2 doesn't assign 256?
(403 is 1 + 2 + 16 + 128 + 256)
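If you want to check the arithmetic yourself, here is a minimal sketch of decoding a flag into its constituent bits. The bit values and meanings come from the SAM specification; the names `SAM_FLAG_BITS` and `decode_flag` are just made up for this example.

```python
# Bit values and their meanings, as defined in the SAM specification.
SAM_FLAG_BITS = {
    1: "read paired",
    2: "read mapped in proper pair",
    4: "read unmapped",
    8: "mate unmapped",
    16: "read on reverse strand",
    32: "mate on reverse strand",
    64: "first in pair",
    128: "second in pair",
    256: "not primary alignment",
    512: "read fails quality checks",
    1024: "PCR or optical duplicate",
    2048: "supplementary alignment",
}


def decode_flag(flag):
    """Return the individual bit values that are set in a SAM flag."""
    return [bit for bit in SAM_FLAG_BITS if flag & bit]


for flag in (403, 355):
    bits = decode_flag(flag)
    print(flag, "=", " + ".join(str(b) for b in bits))
    for bit in bits:
        print("   ", bit, SAM_FLAG_BITS[bit])
```

Both 403 and 355 include 256, so those records are non-primary alignments.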
I don't know of any mapper that is able to detect duplicates, although that's not to say one doesn't exist. However, I'm pretty certain that neither Hisat2 nor Bowtie2 can.
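So filtering on 256 should work on the raw mapper output, while filtering on 1024 only does anything once some other tool has marked the duplicates. As a rough sketch of what flag-based filtering amounts to (roughly what `samtools view -F 1280` does, since 1280 = 256 + 1024), assuming plain tab-separated SAM text; the function names and the example records are invented for illustration:

```python
SECONDARY = 256    # not primary alignment
DUPLICATE = 1024   # PCR or optical duplicate (only set if a tool has marked it)


def keep_record(sam_line, exclude_mask):
    """True for header lines and for records with none of the excluded bits set."""
    if sam_line.startswith("@"):
        return True
    flag = int(sam_line.split("\t")[1])  # FLAG is the second SAM column
    return (flag & exclude_mask) == 0


def filter_sam(lines, exclude_mask=SECONDARY | DUPLICATE):
    """Drop records that have any bit of exclude_mask set."""
    return [line for line in lines if keep_record(line, exclude_mask)]


# Made-up records, purely to show the behaviour.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read1\t355\tchr2\t500\t1\t50M\t=\t700\t250\t*\t*",   # 355 includes 256
    "read2\t1123\tchr1\t900\t60\t50M\t=\t1100\t250\t*\t*",  # 1123 includes 1024
]

for line in filter_sam(example):
    print(line)
```

Only the header and the flag-99 record survive here. On real mapper output the 1024 part of the mask will not match anything until duplicates have been marked.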
The manuals for both state "Flags relevant to Bowtie2/Hisat2 are: ..." and then only list up to 128 for some reason! So I was quite confused when I had flags containing 256 popping up. It seems that must be an accidental omission in the manuals.
Thanks so much for your answers, great to get confirmation that they definitely can't detect duplicates.