Dear all,
I merged the g.vcf.gz files of my WES samples and would like to convert merged.g.vcf.gz to merged.vcf with the GATK GenotypeGVCFs tool. My main aim is to run a single-variant association test on this merged.vcf file, which contains cases and controls. The GATK version I use is 4.1.4.1.
gatk=/usr/local/bin/gatk-4.1.4.1/gatk-package-4.1.4.1-local.jar
java -Xms128g -Xmx128g -jar "$gatk" GenotypeGVCFs \
-R hg19.p13.plusMT.no_alt_analysis_set.fa \
-V merged.g.vcf.gz \
-O merged.vcf.gz \
-D dbsnp_138.hg19.vcf \
-G StandardAnnotation \
--only-output-calls-starting-in-intervals \
--use-new-qual-calculator \
--merge-input-intervals \
-L region.bed 2> logs/GenotypeGVCFs.log
However, these samples were sequenced with three different WES kits, so they come with three different region.bed files. For genotyping I need to provide an interval file (-L region.bed above), but I don't know which BED file to use. Should I prepare a new BED file containing all the regions in these three BED files (the union), or only the regions at the intersection of the three?
I would be grateful if you could help me with this issue or tell me the general approach in these situations.
All the best,