Hello,
To perform population genomic analysis, I am trying to merge many large variant files (gVCFs): over 20 files totaling several dozen GB.
So far I have used bcftools merge and vcf-merge, but they are very slow at combining these files into a single file.
Do you have any ideas for merging huge gVCF files? I chose gVCF rather than plain VCF because I want to retain as many variants as possible.
Thank you!
Merging files of that size as VCFs is always going to be slow. If you don't mind losing some metadata and have a lot of memory at your disposal, then converting to PLINK binary format and merging in that format will speed things up a lot.
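A rough sketch of that workflow with PLINK 1.9 might look like the following. The sample names (`sample1`..`sample3`) and file paths are placeholders for illustration; adjust them to your own data, and note that PLINK discards most VCF metadata (INFO/FORMAT fields, header lines) during conversion.

```shell
#!/bin/sh
# Sketch: convert each (g)VCF to PLINK binary, then merge in binary format.
# Assumes PLINK 1.9 is on PATH and inputs are named sampleN.g.vcf.gz (hypothetical names).

# 1) Convert each (g)VCF to a PLINK binary fileset (.bed/.bim/.fam).
for f in sample1 sample2 sample3; do
  if [ -f "${f}.g.vcf.gz" ]; then
    plink --vcf "${f}.g.vcf.gz" --make-bed --out "${f}"
  fi
done

# 2) List every fileset except the first one, one prefix per line.
printf '%s\n' sample2 sample3 > merge_list.txt

# 3) Merge them all onto the first fileset (fast, but memory-hungry).
if [ -f sample1.bed ]; then
  plink --bfile sample1 --merge-list merge_list.txt --make-bed --out merged
fi
```

One caveat worth checking first: plink merging can stop on multiallelic or strand-ambiguous sites, so you may need a cleanup pass on the inputs before the merge succeeds.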
I also think converting to PLINK binary and merging in that format is a good solution! I will try it. Thank you.