Discrepancy in pysam fetching region
1
0
Entering edit mode
2.5 years ago
mhasa006 ▴ 70

I'm trying to fetch a region using pysam. My intention is to extract all the regions that have the same reference start position. I'm working with single-end reads.

Let's say the position is x. Using pysam I'm running the following command to first get all the reads that have any overlapping with the position x.

for read in BAMFile.fetch(contig, x, x+1):
    process_read

I counted the number of reads that are contig_x and found the number of reads is 11.

To confirm I checked the original BAM file with samtools and grep the position x with the following command

samtools view chr1.temp.bam | grep x

This returned 45 reads with the contig_x.

Is it possible that pysam is not fetching all the reads? Is there any reason/explanation behind that? Or am I doing something wrong?

pysam python fetch bam • 898 views
ADD COMMENT
2
Entering edit mode
2.5 years ago
mhasa006 ▴ 70

Found the reason.

The for loop is reading positions from a BED file. The BED file is a zero-based coordinate start. However, the BAM file is 1 based coordinate.

So when counting the reads from the original BAM file with samtools, the following should be used

samtools view chr1.temp.bam | grep x + 1
ADD COMMENT

Login before adding your answer.

Traffic: 2607 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6