How to combine multiple single-sample VCFs into one multi-sample VCF, and how to interpret a multi-sample VCF for a bacterial genome?
5.9 years ago
S AR ▴ 80

I used GATK HaplotypeCaller for variant calling on 2300 MTB (Mycobacterium tuberculosis) strains. Now I want to combine the results into a single multi-sample VCF.

I used CombineVariants, but I'm confused about whether this is the correct approach.

Each of my VCFs comes from a single isolate, i.e. one VCF per individual sample. But the GATK documentation says: CombineVariants can be used to combine variant calls that were produced from the same samples but using different methods, for comparison.

But my variant files are from different samples. I just want to make a multi-sample VCF so that I can compare the variants of each strain. The VCF file generated by CombineVariants is:

Mutltisample.vcf. I'm not sure whether it gives the proper results.

And how do I interpret this union.vcf? For example, the GT column shows ./. for some samples; what does that mean? Also, this is a haploid genome, so GT should contain a single value, but a few columns show 1|0, 0|1...

Can anyone help?

Thank you

GATK VCF Variant calling • 6.3k views
5.9 years ago

You could merge the files with bcftools:

bcftools merge -O z -o merged.vcf.gz sample1.vcf.gz sample2.vcf.gz
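
With 2300 inputs it is impractical to type every file on the command line. As far as I know, bcftools merge also expects each input to be bgzip-compressed and indexed, and a missing index is a common reason for the command to fail. Below is a minimal Python sketch that only builds the command lines: one to index each file and one to merge from a file list via the `-l` option. The helper names and `vcf_list.txt` are hypothetical, and the sample file names are the placeholders from the command above.

```python
import shlex
from pathlib import Path


def build_index_command(vcf_gz):
    """Return the bcftools command that writes a tabix (.tbi) index for one file."""
    return ["bcftools", "index", "-t", str(vcf_gz)]


def write_file_list(vcf_paths, list_file):
    """Write one input path per line, the layout read by `bcftools merge -l`."""
    Path(list_file).write_text("".join(f"{p}\n" for p in vcf_paths))
    return list_file


def build_merge_command(list_file, output):
    """Return the merge command that reads its inputs from list_file."""
    return ["bcftools", "merge", "-l", str(list_file), "-O", "z", "-o", str(output)]


if __name__ == "__main__":
    # Placeholder inputs; replace with the real per-sample files.
    vcfs = ["sample1.vcf.gz", "sample2.vcf.gz"]
    for vcf in vcfs:
        print(shlex.join(build_index_command(vcf)))
    # vcf_list.txt is a hypothetical name; create it with write_file_list().
    print(shlex.join(build_merge_command("vcf_list.txt", "merged.vcf.gz")))
```

The printed commands can be inspected first and then run in a shell, or passed to `subprocess.run(cmd, check=True)` once bcftools is on the PATH.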


The command is not working!

5.9 years ago

But GATK says: CombineVariants can be used to combine variant calls that were produced from the same samples but using different methods, for comparison.

Yes, CombineVariants can merge VCFs that contain the same samples (--genotypeMergeOptions PRIORITIZE), but in your case the inputs come from different samples, so you need to use the option --genotypeMergeOptions REQUIRE_UNIQUE, which is documented as:

Require that all samples/genotypes be unique between all inputs.
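
On reading the merged file: as I understand the VCF format, ./. is a missing genotype. It is expected when per-sample VCFs that list only variant sites are merged, because a sample with no record at a site has nothing to report there (this does not distinguish "matches the reference" from "no coverage"). A value such as 1|0 is a phased call with two alleles, which suggests HaplotypeCaller was run with its default ploidy of 2 rather than a ploidy of 1. The sketch below is a toy model in Python, not what CombineVariants does internally; the function names, strain names, and positions are all made up for illustration.

```python
def merge_single_sample_calls(calls_by_sample):
    """Union the sites of all samples; fill absent sites with the missing genotype."""
    sites = sorted({site for calls in calls_by_sample.values() for site in calls})
    samples = sorted(calls_by_sample)
    return {
        site: {s: calls_by_sample[s].get(site, "./.") for s in samples}
        for site in sites
    }


def describe_genotype(gt):
    """Summarise a GT string: number of alleles, phasing, and missingness."""
    phased = "|" in gt
    alleles = gt.replace("|", "/").split("/")
    return {
        "ploidy": len(alleles),
        "phased": phased,
        "missing": all(a == "." for a in alleles),
    }


if __name__ == "__main__":
    # Hypothetical per-sample calls keyed by (chromosome, position).
    calls = {
        "strainA": {("Chromosome", 100): "1/1"},
        "strainB": {("Chromosome", 200): "0|1"},
    }
    merged = merge_single_sample_calls(calls)
    for site, genotypes in merged.items():
        for sample, gt in genotypes.items():
            print(site, sample, gt, describe_genotype(gt))
```

In this toy example strainA gets ./. at position 200 and strainB gets ./. at position 100, which mirrors the pattern in a merged file built from variant-only VCFs.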


Hey Pierre, do you have any advice on using GenotypeGVCFs for multiple samples from viruses that are similar but were aligned to different reference strains? That is, slightly different reference strains were used in order to improve read alignment, but I would like to compare the samples afterwards even though the references differ slightly. Thank you.
