I am trying to generate a BCF file through samtools mpileup, but it is taking too long. How long does it usually take to generate a BCF file?
My BAM file is 26.9 GB, and the sorted BAM file is 17.6 GB.
It can be slow, but I don't have an exact time. It's easy for you to measure this yourself though, and it'd be more accurate than anything people here can give you, as hardware differs. Index the BAM and time it on a smaller region. Extrapolate from there.
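A minimal sketch of that timing approach, in case it helps. The file names, the contig name `chr1`, and the 5 Mb region size are placeholders I made up, and the helper `extrapolate_hours` is hypothetical. The scaling is purely linear, so treat the result as a rough estimate, since coverage is rarely uniform across the genome.

```shell
#!/usr/bin/env bash
# Placeholder inputs (assumptions, not from the question); substitute your own.
BAM=sorted.bam
REF=reference.fa
REGION=chr1:1-5000000   # a 5 Mb test region

# Linearly scale a measured runtime to the whole genome.
# usage: extrapolate_hours <seconds_for_region> <region_bp> <genome_bp>
extrapolate_hours() {
    awk -v s="$1" -v r="$2" -v g="$3" \
        'BEGIN { printf "%.1f\n", s * g / r / 3600 }'
}

# Index the BAM (needed for -r), then time mpileup on the test region only.
# The uncompressed BCF output is discarded; only the runtime matters here.
time_region() {
    samtools index "$BAM"
    time bcftools mpileup -f "$REF" -r "$REGION" -Ou -o /dev/null "$BAM"
}
```

For example, if `time_region` reports about 60 seconds for the 5 Mb region, `extrapolate_hours 60 5000000 3000000000` gives a rough whole-genome figure in hours for a 3 Gb genome.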
A few pointers though.
- Use bcftools mpileup instead of samtools mpileup.
- Consider -X. This can dramatically improve indel calling recall and precision.
- Use parallel to process each region and spit out a new file. You can then just bcftools concat these together in order to get the final mpileup.
The size shouldn't differ between sorted and unsorted files.
Maybe this is a SAM file, and what you're referring to as the sorted BAM is the BAM file? But a 30 GB SAM should compress to a file smaller than 18 GB.
I suggest you post the commands you used as well as what you're sequencing.
Not quite. Sorted BAM files may have similar elements closer together, so they compress better. See e.g. "Size of BAM file reduces after sorting with samtools".
Huh, does this happen with merged BAMs?
Now that I think about it, I have never written out an unsorted BAM to notice the difference between sorted and unsorted file sizes, and just assumed they would be similar.
Bad assumption :/ Thanks for correcting.
My SAM file is 126 GB.
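Coming back to the per-region pointer earlier in the thread (parallel plus bcftools concat), here is a rough sketch of how that could look. It assumes GNU parallel is installed, that the BAM is already sorted and indexed, and that splitting by whole contig is fine for your data. The file names, the job count of 4, and the helper names `list_regions` and `pileup_by_region` are all placeholders of mine, not anything from the posts above.

```shell
#!/usr/bin/env bash
# Placeholder inputs (assumptions); substitute your own.
BAM=sorted.bam
REF=reference.fa

# Print one contig name per line from a FASTA index (.fai);
# column 1 of the .fai holds the sequence name.
list_regions() {
    cut -f1 "$1"
}

# Run bcftools mpileup once per contig, 4 jobs at a time,
# then join the per-contig BCFs in reference order.
pileup_by_region() {
    samtools faidx "$REF"    # writes "$REF.fai"
    list_regions "$REF.fai" |
        parallel -j 4 "bcftools mpileup -f $REF -r {} -Ob -o part.{}.bcf $BAM"
    # Build the list of parts in the same order as the reference.
    list_regions "$REF.fai" | sed 's/.*/part.&.bcf/' > parts.txt
    bcftools concat -f parts.txt -Ob -o all.bcf
}
```

Contig names containing characters such as `*` or `:` may need extra care in the output file names. Splitting large contigs into smaller windows would balance the jobs better, at the cost of a slightly more involved region list.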