3.0 years ago
michael.flower.14
I've produced a set of about 400 GVCF files with gatk HaplotypeCaller, using the -ERC GVCF option.
I'd now like to combine them for downstream genotyping and variant recalibration. I believe I can combine them with gatk CombineGVCFs:
gatk CombineGVCFs \
-R reference.fasta \
--variant sample1.g.vcf.gz \
--variant sample2.g.vcf.gz \
-O cohort.g.vcf.gz
What I don't know is how to pass all 400 GVCF files to CombineGVCFs. I've heard this can be done with the --arguments_file option, but I don't know how to build such a file.
Any help gratefully received!
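For reference, one way such a file could be built is sketched below. It assumes that GATK4 reads the contents of an arguments file as if they had been typed on the command line, so the file simply holds one "--variant <path>" entry per GVCF. The function name build_args_file, the directory gvcfs/ and the file name cohort.args are placeholders for illustration, not anything from this thread.

```shell
# Sketch only: write one "--variant <path>" line per GVCF found in a directory.
# build_args_file, gvcfs/ and cohort.args are hypothetical names.
build_args_file() {
    gvcf_dir="$1"    # directory holding the per-sample *.g.vcf.gz files
    args_file="$2"   # arguments file to create

    # Find the GVCFs, sort them for a reproducible order,
    # and prefix each path with the --variant flag.
    find "$gvcf_dir" -name '*.g.vcf.gz' | sort | sed 's/^/--variant /' > "$args_file"
}

# Example use (commented out so the sketch runs without GATK or real data):
# build_args_file gvcfs cohort.args
# wc -l < cohort.args    # should match the number of samples, about 400 here
# gatk CombineGVCFs -R reference.fasta --arguments_file cohort.args -O cohort.g.vcf.gz
```

Paths containing spaces would probably need quoting inside the file, so it is worth checking a few lines of cohort.args by eye before launching the full run.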
Thank you, that seems to allow CombineGVCFs to run.
However, the resulting VCF file seems to have only one sample in it. The columns in the output are as follows:
I'd expected a column of genotypes for each sample (i.e. 400 columns).
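A quick way to confirm how many sample columns a combined file really contains is to read its #CHROM header line, where every field after FORMAT is a sample name. The sketch below does only that and makes no claim about why samples might be missing. The function name list_samples is a placeholder, and it assumes the file is gzip- or bgzip-compressed.

```shell
# Sketch only: print the sample names in a compressed VCF, one per line.
# list_samples is a hypothetical helper name.
list_samples() {
    # Decompress, take the first #CHROM header line,
    # drop the nine fixed columns (CHROM..FORMAT), and split the rest on tabs.
    gzip -dc "$1" | grep -m1 '^#CHROM' | cut -f10- | tr '\t' '\n'
}

# Example use (commented out so the sketch runs without real data):
# list_samples cohort.g.vcf.gz | wc -l      # number of sample columns
# list_samples sample1.g.vcf.gz             # sample name stored in one input GVCF
```

Running the same helper on a few of the input GVCFs shows which sample name each one carries, which can then be compared with the names in the combined output.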
This is unrelated to the original question. Please validate the answer.
Done, thank you.