How to create a multi-sample VCF file
2.3 years ago

Hello everyone, I have 3 samples on which I have performed variant calling separately. Now I want to combine all three of them into a single VCF file in the following format:

CHROM  POS ID  REF ALT QUAL  FILTER  INFO Sample1 sample2 sample3

The VCFs are from individual samples. I could only find tools to merge VCFs of a single sample. Also, my individual VCFs do not have sample names. Any input on this would be helpful!

Thanks in advance.

VCF multi-sample • 3.8k views
2.3 years ago
raphael.B ▴ 520

Hello, you can use bcftools merge.


Hi, that requires sample names in the VCF files, but mine don't have sample names.


If there is no sample name, there is no genotype, so 'merging' VCFs is meaningless.


Show us the output of:

grep -A2 -E "^#CHR" file.vcf | column -t -s $'\t'

If the VCF is gzipped, use zgrep instead of grep.
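
For anyone following along: the sample names are simply the columns after FORMAT on the #CHROM header line, so they can also be listed with standard tools. A minimal sketch, assuming an uncompressed, tab-delimited VCF; the file `demo.vcf` and the helper `list_samples` are made up for illustration and are not part of any tool:

```shell
# Build a tiny single-sample VCF to try this on (hypothetical demo.vcf).
printf '##fileformat=VCFv4.2\n' > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tunknown\n' >> demo.vcf
printf '1\t2097\t.\tG\tA\t606.259\t.\tDP=51\tGT:DP\t0/1:51\n' >> demo.vcf

# Sample names are columns 10 onward of the #CHROM header line.
# Prints one name per line; prints nothing if the VCF has no sample columns.
list_samples() {
    grep -m 1 '^#CHROM' "$1" | cut -f 10- | tr '\t' '\n'
}

list_samples demo.vcf   # prints: unknown
```

If bcftools is installed, `bcftools query -l file.vcf` should report the same list.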


Hi Ram, here's the output:

#CHROM  POS    ID  REF  ALT  QUAL     FILTER  INFO                                                                                                                                                                                                                                                                                                                                                             FORMAT                unknown
1       2097   .   G    A    606.259  .       AB=0.490196;ABP=3.05288;AC=1;AF=0.5;AN=2;AO=25;CIGAR=1X;DP=51;DPB=51;DPRA=0;EPP=3.09716;EPPR=4.34659;GTI=0;LEN=1;MEANALT=1;MQM=59.36;MQMR=60;NS=1;NUMALT=1;ODDS=139.596;PAIRED=0.12;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=913;QR=924;RO=26;RPL=11;RPP=3.79203;RPPR=6.01695;RPR=14;RUN=1;SAF=7;SAP=13.5202;SAR=18;SRF=13;SRP=3.0103;SRR=13;TYPE=snp                GT:DP:RO:QR:AO:QA:GL  0/1:51:26:924:25:913:-67.1143,0,-68.1156
1       48495  .   A    G    157.261  .       AB=0.259542;ABP=68.8009;AC=1;AF=0.5;AN=2;AO=34;CIGAR=1X;DP=131;DPB=131;DPRA=0;EPP=19.3602;EPPR=3.21178;GTI=0;LEN=1;MEANALT=1;MQM=28.1471;MQMR=29.8969;NS=1;NUMALT=1;ODDS=36.2106;PAIRED=1;PAIREDR=0.896907;PAO=0;PQA=0;PQR=0;PRO=0;QA=1234;QR=3589;RO=97;RPL=7;RPP=28.557;RPPR=17.0017;RPR=27;RUN=1;SAF=32;SAP=60.4905;SAR=2;SRF=70;SRP=44.4026;SRR=27;TYPE=snp  GT:DP:RO:QR:AO:QA:GL  0/1:131:97:3589:34:1234:-39.1919,0,-190.138 

As I said, without genotypes 'merging' VCFs is meaningless. Maybe you just want bcftools concat.


Hi @Pierre Lindenbaum, would you please explain why it is meaningless? I used bcftools reheader to rename 'unknown' to the desired sample name, then used bcftools merge, and it worked. I have very little knowledge in this area. Please correct me if I am wrong.


You're not wrong. The SAMPLE fields are optional in VCF, and merging is meaningless only when there are no SAMPLE fields. This approach (rename and merge) would have been my recommendation too, so you've solved it yourself. You do have a genotype field, and the merge process is routine, not meaningless in the slightest.
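
The rename-and-merge approach can be sketched as a small shell function. This is only a sketch under assumptions: the inputs are bgzip-compressed single-sample VCFs, bcftools is on the PATH, and the function name `rename_and_merge`, its `NAME=FILE` argument convention and the intermediate file names are all hypothetical choices made for illustration.

```shell
# Sketch: give each single-sample VCF a real sample name, then merge them.
# Assumes bgzip-compressed inputs; all names here are hypothetical.
# Usage: rename_and_merge OUT.vcf.gz NAME1=FILE1.vcf.gz NAME2=FILE2.vcf.gz ...
rename_and_merge() (
    out=$1
    shift
    n=$#    # number of NAME=FILE pairs
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first '='
        vcf=${pair#*=}      # text after the first '='
        # reheader -s reads the new sample names from a file, one per line
        printf '%s\n' "$name" > "$name.samples.txt"
        bcftools reheader -s "$name.samples.txt" \
            -o "$name.renamed.vcf.gz" "$vcf" || exit 1
        # merge needs an index for every input file
        bcftools index -f "$name.renamed.vcf.gz" || exit 1
        # collect the renamed file at the end of the argument list
        set -- "$@" "$name.renamed.vcf.gz"
    done
    shift "$n"    # drop the original pairs, keep only the renamed files
    bcftools merge -Oz -o "$out" "$@"
)

# Example call (file names are placeholders):
# rename_and_merge merged.vcf.gz Sample1=s1.vcf.gz Sample2=s2.vcf.gz Sample3=s3.vcf.gz
```

The body runs in a subshell, so the helper variables do not leak into the calling shell. The merged file should then carry one genotype column per sample after FORMAT.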


Alright. Thank you for the explanation.
