Unsynchronized simultaneous reads on a single .bam file
7.2 years ago

Hello,

I am running samtools mpileup several times simultaneously on the same .bam file, and I am wondering whether reading the same .bam file from multiple processes at once could cause problems or corrupt my output. For example, if I ran these commands simultaneously:

samtools mpileup -v -u --region chr1:1-1000 --output region1_mileup_output.txt my_reads.bam
samtools mpileup -v -u --region chr2:1-1000 --output region2_mileup_output.txt my_reads.bam
samtools mpileup -v -u --region chr3:1-1000 --output region3_mileup_output.txt my_reads.bam
  1. I know I could list the specific regions in a .bed file and pass that to mpileup, but for my purposes it is most convenient to have the output for each region written to a separate file.
  2. I know I could run the commands sequentially, but I would prefer to run them simultaneously to save time.

I am planning on scaling up and simultaneously reading a .bam file a few hundred times with different regions specified. Will this become a problem once I scale up?

Thanks!

samtools bam mpileup • 1.3k views

Consider running samtools mpileup once on the whole chromosome, then using BEDTools intersect to get the overlap between your VCF and a BED file of locations.

7.2 years ago

Simultaneous reads don't cause problems: every samtools process only reads the .bam file and none of them modifies it. Concurrency problems arise when at least one process writes to data that other processes are reading or writing at the same time. That doesn't apply here, since each of your commands writes to its own output file.

Some filesystems may perform better or worse with lots of simultaneous processes reading from the same file, so just make a note of whether the performance dramatically declines.
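If performance does drop once you scale to a few hundred regions, one option is to cap how many samtools processes run at the same time rather than starting them all at once. Below is a minimal sketch of that idea, not a tested pipeline. The names regions.txt, BAM and JOBS and the output file naming are my own assumptions; the mpileup flags are copied from the question. It relies on xargs -P, which GNU and BSD xargs provide but which is not part of POSIX.

```shell
#!/bin/sh
# Sketch only: BAM, JOBS and regions.txt are hypothetical names, not from the post.
BAM=${BAM:-my_reads.bam}   # input file, only ever read by the samtools processes
JOBS=${JOBS:-4}            # maximum number of samtools processes running at once

# Turn a region such as chr1:1-1000 into an output file name without a colon.
out_name() {
    printf '%s_mpileup_output.txt\n' "$(printf '%s' "$1" | tr ':' '_')"
}

# Read regions from stdin (one per line) and print one command line per region.
make_commands() {
    while IFS= read -r region; do
        [ -n "$region" ] || continue
        printf 'samtools mpileup -v -u --region %s --output %s %s\n' \
            "$region" "$(out_name "$region")" "$BAM"
    done
}

# Run the command lines read from stdin, at most $JOBS at a time.
# Assumes there is no whitespace or quoting inside file names or regions.
run_parallel() {
    xargs -P "$JOBS" -I CMD sh -c CMD
}

# Usage (commented out so that loading the functions starts nothing):
#   make_commands < regions.txt | run_parallel
```

Starting with a small JOBS value and raising it while watching the wall-clock time is a simple way to find the point where the filesystem stops keeping up.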

As swbarnes2 notes, it's more efficient to read the full file once.
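If you go the single-pass route but still want separate output files, you can split the combined output afterwards. Here is a rough sketch, assuming the combined result is an uncompressed VCF (which is what -v -u produces in the question). It splits by chromosome only, which matches the example where each region is on a different chromosome; the function name and the file names are hypothetical. Several regions on the same chromosome would need a real interval overlap, for example with BEDTools intersect.

```shell
#!/bin/sh
# Sketch only: split an uncompressed VCF into one file per chromosome.
# split_by_chrom and the file names below are hypothetical, not from the post.
#   $1 = combined VCF, $2 = prefix for the per-chromosome output files
split_by_chrom() {
    awk -v prefix="$2" '
        # Collect the header lines so each output file gets its own copy.
        /^#/ { header = header $0 "\n"; next }
        # First record for a chromosome: start its file with the header.
        !($1 in seen) { seen[$1] = 1; printf "%s", header > (prefix $1 ".vcf") }
        # Append the record to the file for its chromosome.
        { print > (prefix $1 ".vcf") }
    ' "$1"
}

# Usage (commented out; all_regions.vcf is assumed to come from one mpileup run):
#   split_by_chrom all_regions.vcf region_
```

awk keeps one output file open per chromosome, so with a very large number of contigs some awk implementations may run into the open-file limit.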
