I am trying to check the coverage of a few regions in some WES samples. I only need a handful of regions; this is the BED file:
2 198266460 198266618
2 198266703 198266860
2 198267274 198267556
2 198267667 198267765
2 198268303 198268494
9 5069919 5070058
9 5073692 5073791
9 5078300 5078450
17 74732949 74732969
20 31021081 31025147
I tried bedtools genomecov, but it was slow. I then searched around and found a few posts mentioning mosdepth, whose README gives this example and timing:
mosdepth --by capture.bed sample-output sample.exome.bam
For a 5.5GB exome BAM and all 1,195,764 ensembl exons as the regions, this completes in 1 minute 38 seconds with a single CPU
Yet my command on an 8.6 GB WES .bam has already been running for more than 25 minutes! Here is the command:
mosdepth --by ${restrictedBed} restricted-output ${fileGenome}
I also tried it with the option -t 3, which should use 3 threads.
Am I missing something? Is there a fast way to get the coverage of a limited set of regions from a BAM?
Thank you very much in advance for any help
Check with
top
if the tool is running at 100% CPU or if any bottlenecks are slowing it down. Are you doing the analysis on a laptop? Timing comparisons are not meaningful unless the hardware is specified. Comparing a 15 W U-series laptop CPU to a 150 W Xeon monster is unfair on many levels.
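A minimal sketch of that check, using a one-shot ps call rather than the interactive top. It assumes a Linux machine with the procps version of ps, and the helper name show_proc_state is made up for illustration:

```shell
# Hypothetical helper: print CPU usage and process state for every
# process whose command name matches the first argument.
show_proc_state() {
    local name="$1"
    # -C selects processes by command name (procps ps).
    # pcpu  = CPU usage in percent
    # stat  = process state: R means running on a CPU, D means
    #         uninterruptible sleep, which usually indicates waiting on I/O
    # etime = elapsed time since the process started
    ps -C "$name" -o pid,pcpu,stat,etime,comm
}
```

Calling show_proc_state mosdepth a few times while the job runs should give a rough picture: a process that sits mostly in state D with low CPU usage is more likely waiting on the disk or network than computing.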
Sorry, it is running on a server machine with 48 CPUs and 128 GB RAM.
And the file is not on a slow remote disk or something? If I/O or the network is the limiting factor, then fast tools won't make much difference.
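One rough way to test this, assuming GNU coreutils (df and dd) on Linux. The helper name check_read_speed is hypothetical, and the 1 GiB read size is an arbitrary choice:

```shell
# Hypothetical helper: show which file system holds a file and time a
# raw sequential read of its first part.
check_read_speed() {
    local file="$1"
    # -T adds the file system type (for example nfs or cifs for network
    # mounts, ext4 or xfs for local disks).
    df -hT "$file"
    # Read up to the first 1 GiB and discard it; dd prints the elapsed
    # time and throughput on stderr when it finishes.
    dd if="$file" of=/dev/null bs=1M count=1024
}
```

Running check_read_speed on the BAM path would show whether it lives on a network mount and roughly how fast it can be read. If the file was read recently, the page cache may make the reported throughput look better than the disk really is.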