I am trying to merge 3 sets of BCFs that contain hundreds of millions of sites for tens of thousands of samples. The BCFs have exactly the same sites, just different samples. Using
bcftools merge 1.bcf 2.bcf 3.bcf -Ob > merged.bcf
looks like it is going to take days, maybe even weeks, at the rate it is going. Even though both are binary formats, would it be faster to first convert each BCF to PLINK and then run:
plink --make-bed --merge-list merge_list.txt --out merged
where merge_list.txt lists the binary PLINK fileset prefix for each BCF, one per line:
1
2
3
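For concreteness, the convert-then-merge workflow I have in mind would look roughly like the sketch below. It assumes PLINK 1.9, whose --bcf import can read these files, and it uses --double-id as one possible way to map VCF sample IDs to FID/IID; both are assumptions on my part, not something I have tested at this scale. The DRY_RUN switch and the run helper are made up for illustration, so by default the script only prints the plink commands and writes merge_list.txt. As far as I understand, .bed keeps only hard-call genotypes, so any other FORMAT fields would be dropped by the conversion.

```shell
#!/bin/sh
# Sketch only: convert each BCF to a binary PLINK fileset, then merge the filesets.
# DRY_RUN=1 (the default here) prints the plink commands instead of running them.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start with an empty merge list.
: > merge_list.txt

for prefix in 1 2 3; do
    # Assumption: --double-id sets both FID and IID to the VCF sample ID.
    run plink --bcf "$prefix.bcf" --double-id --make-bed --out "$prefix"
    # A line with a single name is read as a binary fileset prefix.
    echo "$prefix" >> merge_list.txt
done

# Merge all filesets named in merge_list.txt into merged.bed/.bim/.fam.
run plink --merge-list merge_list.txt --make-bed --out merged
```

Running it with DRY_RUN=0 would execute the conversions and the merge for real.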
Oh darn it. Thank you!