Calculating inner mate distance for paired reads
3
3
Entering edit mode
8.0 years ago
dgimode ▴ 70

Is there a way to conveniently calculate/derive the inner mate distance between paired-end reads in a bam file? I want the distance from the right most end of the left read to the left most end of the right read and I want this for all the paired reads that map in the bam file.

Thanks

alignment bam • 5.5k views
ADD COMMENT
1
Entering edit mode
8.0 years ago

bamPEFragmentSize from deepTools will output the mean/median fragment and read sizes. The expected inner mate distance is then just expected fragment size - 2 * expected read size.

ADD COMMENT
1
Entering edit mode

bamPEFragmentSize gives out a summary for all the reads in the bam file. However, I want to get the inner mate distance for each read pair as opposed to the expected inner mate distance based on the summary. Is there a way to achieve this?

ADD REPLY
1
Entering edit mode

It's actually particularly hard, because depending on the arrangement of the reads you might not have all the information required from a single alignment (you need to know the sequence of the mate, PSEQ, or even better the CIGAR string of the mate, PCIGAR, both of which don't exist in BAM). Funnily enough, this was the question the prompted me asking if there was a file format that had 1 row per sequenced DNA fragment, because for such a file format this question would be trivial to answer. It's only because BAM deals with alignment rather than fragments it's tricky. However, there is always 1 alignment in the pair that will let you calculate it:

http://imgur.com/a/2q3S9

Here you're looking for RI and MI (Read Inner, Mate Inner), so you can subtract the two to get the insert size. Can you use pysam? If so, the code need to do this would be easy enough. You just have to write a rule that targets the read types where you can calculate the RI/MI, and ignore the corresponding read type. And ignore secondary alignments.

ADD REPLY
0
Entering edit mode

I am going to try write a pysam script to do this. Thanks for the helpful input.

ADD REPLY
0
Entering edit mode

Is there any way in samtools to do this?

Thanks.

ADD REPLY
1
Entering edit mode

Probably not.

ADD REPLY
0
Entering edit mode
7.4 years ago
bioinfo8 ▴ 230

I have used CollectInsertSizeMetrics.jar (ref: Picard MarkDuplicates error after using bamaddrg: "bin field of BAM record does not equal value computed..." ) and got the attached insert size histogram for my bam file (which includes properly-paired QC-passed reads, sorted and indexed). However, I am unable to understand it properly.

In this post Mystery: What is going on with this Insert Size Metrics Histogram I can see both FR and RF in the histogram, but in mine I cannot.

Please guide.

Thanks!enter image description here

ADD COMMENT
0
Entering edit mode

Please post this as a new question.

ADD REPLY
1
Entering edit mode
ADD REPLY
0
Entering edit mode
5.7 years ago

I know its been a while, but were you able to find a solution to this problem?

ADD COMMENT

Login before adding your answer.

Traffic: 2123 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6