Clarification regarding SAM flags "read reverse strand" (flag 16/0x10) and "mate reverse strand" (flag 32/0x20)
6 months ago
kalavattam ▴ 280

In paired-end sequencing, each DNA fragment is sequenced from both ends, giving two reads: one forward and one reverse. When "properly paired," these reads are expected to align to the reference genome on opposite strands, pointing towards one another.

When properly paired, each read in a pair will typically have one of the following four SAM flags: 83 (0x53), 163 (0xA3), 99 (0x63), or 147 (0x93). These SAM flags contain information on "read reverse strand" (flag 16/0x10) or "mate reverse strand" (flag 32/0x20).
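To make the composition of these four values concrete, here is a minimal sketch that splits a FLAG into its individual bits. The bit values follow the SAM specification; the dictionary labels and the helper name `decode_flag` are illustrative choices, not part of any library.

```python
# Bit values as defined in the SAM specification; the labels are
# informal descriptions chosen for this example.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "fails quality checks",
    0x400: "duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (83, 163, 99, 147):
    print(flag, hex(flag), decode_flag(flag))
```

Under this decoding, 83 and 147 carry the 0x10 bit (the record itself is on the reverse strand), while 163 and 99 carry the 0x20 bit (the record's mate is on the reverse strand).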

My question is this: "read reverse strand" (flag 16/0x10) and "mate reverse strand" (flag 32/0x20) do not directly relate to the strandedness of the library in terms of library protocols (e.g., stranded vs. non-stranded); instead, they simply indicate the directionality of the read in relation to the reference genome. Is that correct? That is, "mate reverse strand" means that a given alignment is considered "the mate" in a read pair and is the reverse complement of the sequence in the reference; "read reverse strand" means that a given alignment is considered "the read" in a read pair and is the reverse complement of the sequence in the reference. Is my understanding correct? If not, please help me understand.

PE BAM SAM flag paired-end • 829 views
6 months ago
kalavattam ▴ 280

My question is this: "read reverse strand" (flag 16/0x10) and "mate reverse strand" (flag 32/0x20) do not directly relate to the strandedness of the library in terms of library protocols (e.g., stranded vs. non-stranded); instead, they simply indicate the directionality of the read in relation to the reference genome. Is that correct?

This is correct. I was conflating "reverse strand" with stranded library preparations.

That is, "mate reverse strand" means that a given alignment is considered "the mate" in a read pair and is the reverse complement of the sequence in the reference; "read reverse strand" means that a given alignment is considered "the read" in a read pair and is the reverse complement of the sequence in the reference. Is my understanding correct?

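This is correct in the sense that both flags describe orientation relative to the reference. One refinement: both bits are set on the alignment record being examined; "read reverse strand" reports that record's own strand, while "mate reverse strand" reports the strand of its mate. More information is available at this Biostars post. Here is some important context from that post: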

The two reads of a pair have opposite directions. One was sequenced on the + strand and one on the - strand. But in the SAM file, all information is given relative to the + strand, going from the 5' end to the 3' end. The read whose information must be flipped gets a flag to indicate it.

Here is what this means for 83/163 aligned read pairs.

This is how the two reads look in vivo:

             (Read #1: FLAG 83) 3' <---------- 5'
5' -----------> 3' (Read #2: FLAG 163)
5' ------------------------------------------- 3' (Reference genome)

And this is how the two reads look in the BAM file:

                                5' ----------> 3' (Read #1: FLAG 83)
5' -----------> 3' (Read #2: FLAG 163)
5' ------------------------------------------- 3' (Reference genome)

Here is what this means for 99/147 aligned read pairs.

This is how the two reads look in vivo:

            (Read #2: FLAG 147) 3' <---------- 5'
5' -----------> 3' (Read #1: FLAG 99)
5' ------------------------------------------- 3' (Reference genome)

And this is how the two reads look in the BAM file:

                                5' ----------> 3' (Read #2: FLAG 147)
5' -----------> 3' (Read #1: FLAG 99)
5' ------------------------------------------- 3' (Reference genome)
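The flip between the two views can be sketched in code. This is a minimal illustration, assuming the stored sequence of a record with the 0x10 bit set is the reverse complement of what the sequencer produced; the function names and the example sequence are made up for this sketch.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

READ_REVERSE = 0x10  # "read reverse strand" bit from the SAM specification


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def original_read_sequence(flag, seq):
    """Return the sequence in the orientation it was sequenced.

    Records flagged 0x10 are stored reverse-complemented, so the
    operation is undone here; other records are returned unchanged.
    """
    if flag & READ_REVERSE:
        return reverse_complement(seq)
    return seq


stored = "AACCGT"  # hypothetical SEQ field from a BAM record
print(original_read_sequence(83, stored))   # 0x10 set: flipped back
print(original_read_sequence(163, stored))  # 0x10 not set: unchanged
```

Base qualities would need to be reversed (not complemented) in the same way; that step is left out to keep the sketch short.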
6 months ago

In paired-end sequencing, every read has a mate, and the flags taken together tell you not only what is up with the read you are looking at, but also what the mate is doing.

"mate reverse strand" means that a given alignment is considered "the mate" in a read pair and is the reverse complement of the sequence in the reference;

It means that the read you are looking at right now has a mate, and the mate is reversed as compared to the reference.
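One way to see this is to compare the two records of a pair: the "mate reverse strand" bit on one record should mirror the "read reverse strand" bit on the other. The following is a minimal sketch of that check, assuming both mates are mapped; the helper names are hypothetical.

```python
READ_REVERSE = 0x10  # this record aligns to the reverse strand
MATE_REVERSE = 0x20  # this record's mate aligns to the reverse strand


def read_is_reverse(flag):
    """True if the record itself is on the reverse strand."""
    return bool(flag & READ_REVERSE)


def mate_is_reverse(flag):
    """True if the record reports its mate on the reverse strand."""
    return bool(flag & MATE_REVERSE)


def strands_agree(flag_a, flag_b):
    """Check that each record's mate bit matches the other's read bit."""
    return (mate_is_reverse(flag_a) == read_is_reverse(flag_b)
            and mate_is_reverse(flag_b) == read_is_reverse(flag_a))


for pair in ((83, 163), (99, 147)):
    print(pair, strands_agree(*pair))
```

Both pairs pass the check, whereas a mismatched combination such as 83 with 147 does not, because each record is on the reverse strand yet reports its mate as forward.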


Thanks for the explanation.

I noticed and corrected a typo in which "read reverse strand" initially read as "mate reverse strand" in the following snippet:

"read reverse strand" means that a given alignment is considered "the read" in a read pair and is the reverse complement of the sequence in the reference.

As an interpretation for "mate reverse strand", you wrote,

It means that the read you are looking at right now has a mate, and the mate is reversed as compared to the reference.

What is the accompanying interpretation for "read reverse strand"?
