Confused about read pair orientation in bam
0
2
Entering edit mode
5.6 years ago
malthesr ▴ 30

I am confused by the orientation of reads in read pairs in a bam file I am looking at. I have aligned paired end Illumina read data with bwa mem -M. The log states

[M::process] read 36146 sequences (2937732 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 2961, 0, 0)

which I take to mean that bwa thinks all my pairs are FR, as I gather is to be expected from paired end sequencing (though I am confused that bwa reads 36146 sequences and only finds 2961 unique pairs). For downstream processing, I have since run samtools sort -n and used the bbtools reformat.sh to retain bit 3 and remove bit 3844, as well as update to sam format 1.4. There are now only reads left with bit flags 83, 99, 147, and 163, as confirmed by

samtools view "${bam}" | awk '{print $2}' | sort | uniq

Based on the two above facts, I'm therefore assuming that all my reads are FR, inward-facing. However, when I run samtools stats "${bam}", I see that

SN      inward oriented pairs:  7028
SN      outward oriented pairs: 4196
SN      pairs with other orientation:   0
SN      pairs on different chromosomes: 0

which confuses me. A bit of manual inspection seems to corroborate this. Running

samtools view "${bam}" | 
awk 'BEGIN{print "id", "flag", "start", "end"} {split($1,a,"."); print a[2], $2, $4, $4+length($10)}'

I get e.g. (recall that these are sorted by name):

id    flag  start      end
19    83    10768      10869
19    163   10760      10861
2117  83    10241      10342
2117  163   10285      10386
2654  99    11558      11657
2654  147   11539      11633
2893  83    133841665  133841766
2893  163   133841620  133841718
6029  83    176995687  176995788
6029  163   176995614  176995710

Flags 83 and 147 tell me that the read is reverse. Now, if the reads were FR, in paired end sequencing I would expect the reverse read to lie after the forward read, which is true for ids 19, 2893, and 6029, but not for ids 2117 or 2654, in accordance with the results from samtools stats.

Based on the above, I am confused as to whether my read pairs are properly oriented. What is going on here?

alignment bam bwa paired end • 3.1k views
ADD COMMENT

Login before adding your answer.

Traffic: 1679 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6