Hello,
for each chromosome, I have a multisample VCF containing the same set of samples. However, the order of the sample columns varies among the VCFs.
For example:
VCF file 1:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 14 11 13 12 3 6 10 7 9 5 8 2 4 1
VCF file 2:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 5 9 11 13 2 7 12 6 10 14 4 3 8 1
I'd like to merge them so that the samples with the same ID are concatenated. However, since the sample order differs, I can't use bcftools concat...
Any suggestions on how I could either
- change the order of the samples in every multisample VCF so that I can use bcftools concat, or
- use a different tool to concatenate the VCF files?
Cheers
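For the first option, one possible approach is sketched below. It assumes every file holds exactly the same set of sample IDs and that the files are uncompressed VCFs. The function name reorder_and_concat and the temporary file names are made up for illustration. The idea is to take the sample order of the first file as the reference, rewrite each remaining file with bcftools view -S (which writes the samples in the order listed in the file), and then run bcftools concat on the result.

```shell
# Sketch only: reorder samples to match the first VCF, then concatenate.
# Usage: reorder_and_concat OUT.vcf REF.vcf OTHER.vcf [OTHER.vcf ...]
# (function name and temp file names are hypothetical)
reorder_and_concat() {
    out=$1
    ref=$2
    shift 2
    tmpdir=$(mktemp -d) || return 1

    # Sample order of the first file becomes the reference order.
    bcftools query -l "$ref" > "$tmpdir/samples.txt" || return 1

    i=0
    for f in "$@"; do
        i=$((i + 1))
        n=$(printf '%03d' "$i")   # zero-pad so the glob below keeps input order
        # -S writes the sample columns in the order given in samples.txt.
        bcftools view -S "$tmpdir/samples.txt" -Ov \
            -o "$tmpdir/reordered.$n.vcf" "$f" || return 1
    done

    # All files now share the same sample columns, so concat can stack them.
    bcftools concat -Ov -o "$out" "$ref" "$tmpdir"/reordered.*.vcf || return 1
    rm -rf "$tmpdir"
}
```

With the file names used later in this thread, the call would be: reorder_and_concat all.vcf 35.vcf 36.vcf. The files are concatenated in the order given, so list them in chromosome order.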
Hi Pierre,
thanks for your answer!
I tried picard GatherVcfs
java -jar $PICARD_HOME/picard.jar GatherVcfs I=35.vcf I=36.vcf O=all.vcf
The VCFs have the same sample IDs but in different order:
bcftools query -l 35.vcf
14 11 13 12 3 6 10 7 9 5 8 2 4 1
bcftools query -l 36.vcf
5 9 11 13 2 7 12 6 10 14 4 3 8 1
I get an error message from plink saying that the sample IDs don't match. However, it can't find any unique names, which makes me think that the problem is, again, the order.
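A quick way to check that suspicion is to compare the sorted sample lists of the two files. The helper below is only a sketch, and the name same_sample_set is made up. It succeeds when both files carry the same set of sample IDs, regardless of order.

```shell
# Sketch only: succeed if two VCFs have the same sample IDs, ignoring order.
# (function name is hypothetical)
same_sample_set() {
    a=$(bcftools query -l "$1" | sort)
    b=$(bcftools query -l "$2" | sort)
    # Empty output means the file could not be read or has no samples.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example: same_sample_set 35.vcf 36.vcf && echo "same samples, only the order differs"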
Okay, I have tried
instead of
and it seems to work! Thank you very much.