How To Differentiate Between Mate Pair And Paired End Reads Based On Sam Flag
13.3 years ago
Abhi ★ 1.6k

Hi All

Just wondering how I can find out, given a SAM/BAM file, how many read pairs are mate pairs, i.e. facing outwards (<--- --->), and how many are paired end, i.e. facing inwards (----> <-----).

__________________ = Reference

<--------      --------->                                  = Mate Pairs


-------->     <----------                                  = Paired Ends

I can see the difference visually in a browser, but when I try to count the pairs based on the SAM flag I am not able to come up with the right combination.

next-gen sequencing paired • 11k views
13.3 years ago
Swbarnes2 ★ 1.6k

The magic flag values for good paired reads are 83, 99, 147 and 163. Those are properly paired, with one read forward and one read reverse.

If the software that you are using to make your SAM files won't count mate pairs as properly paired, then you are looking for 81, 97, 145 and 161. Those are reads where one is forward and one is reverse, but where the software for some reason thinks that the pair isn't right, which could be a mate pair.
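To see where these numbers come from, here is a minimal sketch that decodes a flag value into its bits. The bit values are the ones defined in the SAM specification; the dictionary, the short bit names and the `decode_flag` helper are made up for this illustration.

```python
# SAM flag bits as defined in the SAM specification.
# The short names are labels chosen for this sketch.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of the bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# The "properly paired" values, followed by the same values without bit 0x2.
for flag in (83, 99, 147, 163, 81, 97, 145, 161):
    print(flag, decode_flag(flag))
```

Each value in the second group is the matching value from the first group minus 2, i.e. the same strand combination with the proper-pair bit cleared.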

If mate pairs are being counted as properly paired, then you want pairs where the forward-facing read points outwards, which means it sits ahead of (to the right of) the reverse read on the reference.

So in a normal paired-end run, all the reads with a flag of 147 have negative isizes (the ISIZE column, called TLEN in current versions of the SAM format). So a positive isize should mean mate pair. Likewise, if the flag is 99, a positive isize means paired end and a negative isize means mate pair.

If the flag is 83, a positive isize means mate pair, negative means paired end.

If the flag is 163, a negative isize means mate pair, positive means paired end.
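The four rules above reduce to one test: the pair faces inwards when the forward read is the leftmost one, and the read with the positive isize is the leftmost one. A rough sketch of that test follows. It assumes you have already pulled the flag (column 2) and the isize (column 9) out of each SAM record as integers; the function names and the `"paired_end"` / `"mate_pair"` labels are invented for this example, and pairs on the same strand, on different chromosomes (isize 0) or with an unmapped read are left unclassified.

```python
from collections import Counter


def pair_orientation(flag, isize):
    """Classify one record as 'paired_end' (inward), 'mate_pair' (outward) or None."""
    paired = bool(flag & 0x1)
    any_unmapped = bool(flag & 0xC)        # 0x4 read unmapped, 0x8 mate unmapped
    reverse = bool(flag & 0x10)
    mate_reverse = bool(flag & 0x20)
    if not paired or any_unmapped or reverse == mate_reverse or isize == 0:
        return None                        # not a forward/reverse pair we can judge
    read_is_leftmost = isize > 0           # leftmost read carries the positive isize
    forward_is_leftmost = read_is_leftmost != reverse
    return "paired_end" if forward_is_leftmost else "mate_pair"


def count_orientations(records):
    """Count pairs by orientation from an iterable of (flag, isize) tuples."""
    counts = Counter()
    for flag, isize in records:
        # Use only primary first-in-pair records so each pair is counted once.
        if flag & 0x40 and not flag & 0x900:
            counts[pair_orientation(flag, isize)] += 1
    return counts


example = [(99, 200), (147, -200), (83, 300), (163, -300), (83, -150), (163, 150)]
print(count_orientations(example))
```

Because the test only looks at the strand bits and the sign of the isize, it gives the same answer for 81, 97, 145 and 161 as for the properly paired values.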


Do you happen to know which aligners count mate pairs as properly paired, and which do not? Thanks!


In case someone is wondering where the numbers come from, here is a useful translation tool: http://picard.sourceforge.net/explain-flags.html


@swbarnes2: thanks for your comments. I did not mention it, but in case the read 1 and read 2 alignments are done separately and the pairing is done manually, we won't have the insert size; it would need to be calculated. Can anything be applied in that case? The kind of data that I have doesn't go well with the default pairing algorithm of e.g. BWA. Also, could you say what the significance of a negative isize is?
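On the sign: in the SAM specification the isize is positive for the leftmost read of the pair and negative for the rightmost one, so the sign only records which read lies to the left. When the two reads were aligned separately, the same information can be recovered from positions and strands. The sketch below is one possible way to do that, not something taken from any aligner: the function name and its parameters are hypothetical, both reads are assumed to map to the same reference sequence, and the number of reference bases each read covers is assumed to be known already (for example from the CIGAR string).

```python
def orientation_from_positions(pos1, rev1, len1, pos2, rev2, len2):
    """Derive orientation and a signed isize for read 1 from two separate alignments.

    pos:  1-based leftmost mapped coordinate of the read
    rev:  True if the read aligned to the reverse strand
    len:  number of reference bases covered by the alignment
    Returns (orientation, isize_for_read1).
    """
    start = min(pos1, pos2)
    end = max(pos1 + len1, pos2 + len2) - 1
    span = end - start + 1
    # Leftmost read gets the positive sign; a tie is given to read 1 (arbitrary).
    isize1 = span if pos1 <= pos2 else -span

    if rev1 == rev2:
        return "other", isize1             # both reads on the same strand

    fwd_pos, rev_pos = (pos2, pos1) if rev1 else (pos1, pos2)
    # Assumption of this sketch: equal start positions are treated as inward facing.
    orientation = "paired_end" if fwd_pos <= rev_pos else "mate_pair"
    return orientation, isize1


print(orientation_from_positions(100, False, 50, 300, True, 50))
print(orientation_from_positions(300, False, 50, 100, True, 50))
```

The first call has the forward read on the left and is reported as inward facing; the second swaps the positions and is reported as outward facing, with a negative isize for read 1.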
