However, it does not seem to take multiple files together. I am getting a VCF for only one sample, but I want one VCF for all samples together. Can you please help me figure out how to get a single VCF file for all the sorted BAM files in INPUT_DIR? Thank you for your time and help!
You need to run HaplotypeCaller individually on each of your BAM files using the "-ERC GVCF" flag. This will produce one GVCF file for each of your BAM files. Then you combine the GVCF files produced by HaplotypeCaller (e.g., with GATK's CombineGVCFs tool) into a single GVCF file. Finally, you run GenotypeGVCFs on the combined GVCF file to get your SNPs.
So, for example, run HaplotypeCaller like this for each of your BAMs:
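A minimal sketch of the three steps (HaplotypeCaller per BAM, then CombineGVCFs, then GenotypeGVCFs), assuming GATK4 command-line syntax and a `gatk` launcher on your PATH. The function name `joint_call`, the `run` helper, the `DRY_RUN` switch, and the output file names are all made up for illustration. With `DRY_RUN` left at its default of 1, the script only prints the commands so you can check them before running anything.

```shell
#!/bin/sh
set -eu

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# joint_call INPUT_DIR OUT_DIR REFERENCE_FASTA   (hypothetical helper)
joint_call() {
  input_dir=$1
  out_dir=$2
  ref=$3
  set --                                # reuse "$@" to collect the -V arguments
  run mkdir -p "$out_dir"

  # Step 1: one GVCF per BAM file.
  for bam in "$input_dir"/*.bam; do
    [ -e "$bam" ] || continue           # no BAM matched the pattern
    sample=$(basename "$bam" .bam)
    gvcf="$out_dir/$sample.g.vcf.gz"
    run gatk HaplotypeCaller -R "$ref" -I "$bam" -O "$gvcf" -ERC GVCF
    set -- "$@" -V "$gvcf"
  done

  if [ "$#" -eq 0 ]; then
    echo "no BAM files found in $input_dir" >&2
    return 1
  fi

  # Step 2: merge the per-sample GVCFs into one cohort GVCF.
  run gatk CombineGVCFs -R "$ref" "$@" -O "$out_dir/cohort.g.vcf.gz"

  # Step 3: joint genotyping on the combined GVCF gives the final VCF.
  run gatk GenotypeGVCFs -R "$ref" -V "$out_dir/cohort.g.vcf.gz" -O "$out_dir/cohort.vcf.gz"
}
```

You would call it as, for example, `joint_call "$INPUT_DIR" gvcfs reference.fasta` to preview the commands, and as `DRY_RUN=0 joint_call "$INPUT_DIR" gvcfs reference.fasta` to actually run them. The directory and reference names here are placeholders.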
Thank you, Pierre! I tried running:
But instead of reading all samples, it just reads the last sample in the directory.
Move INPUT=... into the loop... of course.
Hi Pierre!
Even after moving INPUT= into the loop, it still considers only the last sample and generates both the gatk.bam and the index file for one sample only. Thank you!
Show us the code...
Hello Pierre,
Here is the code:
INPUT should extract the data from the "${BAM}" variable.
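As a rough illustration of that point, the sketch below assigns INPUT from "${BAM}" inside the loop, so it changes on every iteration, and derives the sample and output names from it. The function name `plan_samples`, the `.bam` suffix, and the `.g.vcf.gz` output naming are assumptions for the example, not taken from the script in question. It only prints what would be processed.

```shell
#!/bin/sh

# plan_samples DIR   (hypothetical helper)
# Prints one "input<TAB>sample<TAB>output" line per BAM file in DIR.
plan_samples() {
  for BAM in "$1"/*.bam; do
    [ -e "$BAM" ] || continue          # skip the literal pattern when nothing matches
    INPUT="$BAM"                       # reassigned for every BAM, not set once up front
    SAMPLE=$(basename "$BAM" .bam)     # e.g. /data/s1.bam gives s1
    OUTPUT="$SAMPLE.g.vcf.gz"          # assumed naming for the per-sample GVCF
    printf '%s\t%s\t%s\n' "$INPUT" "$SAMPLE" "$OUTPUT"
  done
}
```

Running `plan_samples "$INPUT_DIR"` should print one line per BAM file. If only the last sample shows up in your own script, INPUT (or the output name) is probably still being set outside the loop, or the GATK command itself sits after the `done`.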
At this point, I'm sorry, but I'll stop helping you.
Duplicate answer....