What the SAM spec gives us:
- Flag 256: secondary alignment
- Flag 4: read unmapped
- Multiple mapping: one of these alignments is considered primary. All
the other alignments have the secondary alignment flag set in
the SAM records that represent them.
- NH (type i): number of reported alignments that contain the query in the
current record
Keeping the above in mind, if the aligned BAM file is from tophat or STAR (unmapped reads not included),
samtools view -F 256 should filter out the secondary alignments, leaving only the primary ones.
On the other hand, if the BAM is from bowtie2, bwa or similar (unmapped reads included in the same BAM),
we need to exclude flag 4 as well (256 + 4 = 260). Hence
samtools view -F 260 would be useful in that case.
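To make the flag arithmetic concrete, here is a minimal Python sketch of the bit test that an exclude mask such as -F 260 performs. The function name and the list of example flag values are made up for illustration; only the bit values 256 and 4 come from the SAM spec.

```python
SECONDARY = 0x100  # 256: secondary alignment
UNMAPPED = 0x4     # 4: read unmapped


def passes_exclude_filter(flag: int, exclude_mask: int) -> bool:
    """Keep a record only if none of the bits in exclude_mask are set in flag."""
    return (flag & exclude_mask) == 0


# Hypothetical FLAG values, chosen only to exercise the filter.
flags = [0, 16, 4, 256, 272, 99, 355]

exclude = SECONDARY | UNMAPPED  # 256 + 4 = 260
primary_mapped = [f for f in flags if passes_exclude_filter(f, exclude)]
print(primary_mapped)  # [0, 16, 99]
```

A record is dropped as soon as either bit is set, so 272 (256 + 16) and 355 (256 + 99) are removed along with plain 4 and 256.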
Now, as asked, there should indeed be a connection between primary and uniquely aligned reads, because a uniquely aligned read has exactly one primary alignment and no secondary alignments. But I doubt there is any FLAG per se that could fetch the uniquely aligned reads directly.
Instead we have to rely on the mapping quality and the NH tag, but there is a problem here.
Though the specification defines a MAPQ field, it does not specify any particular value for uniquely mapped reads. In other words, the quality value for uniquely mapped reads depends on the aligner used. For example, STAR assigns 255 to uniquely mapped reads.
Another option is the NH tag from the SAM spec mentioned at the top. Accordingly, NH:i:1 should indicate a uniquely mapped read.
But for cases where the quality value used for unique alignments is not clearly specified and the NH tag is not written either, a flawless indicator of unique mapping is something I have been searching for for months without finding an answer.
I look forward to better answers (and corrections, if any), touching on the supplementary alignment flag as well if relevant.
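The two checks discussed above (NH:i:1, and an aligner-specific MAPQ such as STAR's 255) can be sketched in Python roughly as follows. The function names and the example records are made up for illustration, and the sketch assumes well-formed tab-separated SAM lines with optional tags starting at the twelfth column.

```python
def get_tag(sam_line, tag):
    """Return the value of an optional tag (e.g. "NH"), or None if it is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional TAG:TYPE:VALUE fields start at column 12
        name, typ, value = field.split(":", 2)
        if name == tag:
            return int(value) if typ == "i" else value
    return None


def is_unique_by_nh(sam_line):
    """True only when the record carries NH:i:1."""
    return get_tag(sam_line, "NH") == 1


def is_unique_by_mapq(sam_line, unique_mapq=255):
    """True when MAPQ (column 5) equals the aligner's 'unique' value (255 for STAR)."""
    return int(sam_line.split("\t")[4]) == unique_mapq


# Hypothetical records: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL tags
rec_unique = "\t".join(["r1", "0", "chr1", "100", "255", "4M", "*", "0", "0", "ACGT", "IIII", "NH:i:1"])
rec_multi = "\t".join(["r2", "0", "chr1", "200", "3", "4M", "*", "0", "0", "ACGT", "IIII", "NH:i:2"])
rec_no_nh = "\t".join(["r3", "0", "chr1", "300", "42", "4M", "*", "0", "0", "ACGT", "IIII"])

for rec in (rec_unique, rec_multi, rec_no_nh):
    print(rec.split("\t")[0], is_unique_by_nh(rec), is_unique_by_mapq(rec))
```

For the third record neither check can say anything useful: the NH tag is missing and the MAPQ value has no agreed meaning, which is exactly the gap described above.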
Jf
Thanks for a good answer. I just have a few comments. Does 256 for secondary alignment apply to both SE and PE reads? Also, do you know how to find out whether a particular aligner writes unmapped reads to the BAM, other than by counting the FLAG values in the BAM file?
I have been working with PE data only, and the 256 flag was quite fine (it is very useful for getting the correct alignment percentages when multi-mapping is allowed). And I cannot think of any reason why there should be any trouble with SE.
Regarding the second part of the question, a quick read through the manual should do. And if it is my first time with a new aligner, I would rather check with flag 4 itself, because we sometimes miss subtle but important details in the manual when in a rush. What I know is below:
Tophat and STAR -> separate mapped and unmapped.
bowtie2 and bwa -> mapped and unmapped together.
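The flag 4 check itself can be done with samtools view -c -f 4, which counts the records that have the unmapped bit set. As a rough Python equivalent working on SAM text lines (for example the output of samtools view -h), a sketch could look like this; the function name and the sample lines are made up for illustration.

```python
def count_unmapped(sam_lines):
    """Count records whose FLAG (column 2) has the unmapped bit (4) set."""
    n = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            n += 1
    return n


# Hypothetical SAM text: one header line, two mapped records, one unmapped record.
sample = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_unmapped(sample))  # 1
```

A count of zero on a whole file suggests the aligner wrote only mapped reads to it (or that every read mapped), so the manual is still worth a look.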
Jf
I can add one:
hisat2 -> mapped and unmapped together
From my experience, STAR and hisat2 can generate the NH:i: tag. Do you know if tophat and bowtie2 can generate the NH tag? From what I can find, they don't seem to do so.
In STAR, you can change the default value of 255 for uniquely mapped alignments to a value you specify with
--outSAMmapqUnique
Not sure about bowtie2. But tophat does give NH.
bowtie2 has AS and XS tags: AS is the alignment score of the reported alignment, and XS is the score of the best alignment found other than the reported one. If they are similar, it means that there are several (at least 2) similar alignments (similar or identical alignment scores at two different locations). You can use them to extract the uniquely aligning reads, where unique means: aligns at only one location with the best score, but could align somewhere else with a worse score.
The XS tag is absent in reads that have a single alignment.
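A minimal Python sketch of that AS/XS rule, assuming tab-separated SAM lines from bowtie2 with AS:i and XS:i among the optional tags. The function names and example records are made up for illustration, and treating a record without AS as not unique is an assumption of this sketch.

```python
def int_tags(sam_line):
    """Return the integer-typed optional tags of a SAM record as a dict."""
    tags = {}
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        name, typ, value = field.split(":", 2)
        if typ == "i":
            tags[name] = int(value)
    return tags


def is_unique_by_as_xs(sam_line):
    """Unique here means: no XS tag at all, or AS strictly better than XS."""
    tags = int_tags(sam_line)
    if "AS" not in tags:   # no alignment score, e.g. an unaligned read (assumption)
        return False
    if "XS" not in tags:   # no second-best alignment was found
        return True
    return tags["AS"] > tags["XS"]


def make_record(name, *tags):
    """Build a hypothetical SAM line with the given optional tags."""
    return "\t".join([name, "0", "chr1", "100", "42", "4M", "*", "0", "0", "ACGT", "IIII", *tags])


single = make_record("r1", "AS:i:-3")              # only one alignment found
better = make_record("r2", "AS:i:-3", "XS:i:-20")  # best hit clearly better
tied = make_record("r3", "AS:i:-3", "XS:i:-3")     # two equally good hits

for rec in (single, better, tied):
    print(rec.split("\t")[0], is_unique_by_as_xs(rec))
```

If "similar" scores should also count as non-unique, the strict comparison could be replaced with a minimum AS minus XS margin of your choosing.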
Today I came across a blog post by Simon Andrews on MAPQ scores from different aligners, which is quite useful in the context of unique alignment. Link