Dear all,
First-time poster here. :)
I was hoping to ask for your recommendations on what resources to request on an HPC cluster to concatenate ~22 chromosomes (each vcf.gz file is ~15GB, so ~330GB in total).
e.g. with bcftools concat -Oz chr1.vcf.gz chr2.vcf.gz ... chrX.vcf.gz > allchr.vcf.gz
Could I request 32 CPUs, each with 16GB of memory (512GB in total)? Would that work?
Any suggestions at all would be appreciated!!
Thanks in advance
Katherine - There are many ways to do this, and none of them should require that much dedicated memory. Regardless, I really encourage you to tabix-index your VCF files. The index only needs to be created once per file, and with it, operations that would otherwise take minutes finish in seconds, including operations similar to the one you propose.
VAL
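For reference, the indexing and concatenation steps could be sketched roughly as below. This is only a sketch under assumptions: the file names (chr1.vcf.gz through chr22.vcf.gz plus chrX.vcf.gz), the list file files.txt, the `run` helper, and the thread count of 4 are all made up for illustration, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical file names (chr1.vcf.gz ... chr22.vcf.gz, chrX.vcf.gz),
# listed in the order they should appear in the output.
: > files.txt
for c in $(seq 1 22) X; do
    echo "chr${c}.vcf.gz" >> files.txt
done

# Index each input once; this writes chrN.vcf.gz.tbi next to each file.
while read -r f; do
    run tabix -p vcf "$f"
done < files.txt

# Concatenate in list order. The 4 threads are an assumed value; as far
# as I know they are only used for compressing the output.
run bcftools concat --threads 4 -f files.txt -Oz -o allchr.vcf.gz

# Index the combined file as well.
run tabix -p vcf allchr.vcf.gz
```

Since bcftools concat streams records from the inputs to the output, a handful of CPUs and a few GB of memory is probably a more realistic request than 512GB, though the right numbers depend on your cluster.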