Hello,
I have three VCF files created with bwa and bcftools. I merged these three files using the bcftools merge command as follows:
bcftools merge file1.vcf.gz file2.vcf.gz file3.vcf.gz > out.vcf
A snippet of out.vcf looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT file1.sorted.bam file2.sorted.bam file3.sorted.bam
1 687 . G A 42.4147 . VDB=0.0298006;SGB=-0.556411;MQ0F=0.5;MQ=19;DP=4;DP4=0,0,0,4;AN=2;AC=2 GT:PL ./.:. 1/1:72,12,0 ./.:.
1 689 . G A 42.4147 . VDB=0.0298006;SGB=-0.556411;MQ0F=0.5;MQ=19;DP=4;DP4=0,0,0,4;AN=2;AC=2 GT:PL ./.:. 1/1:72,12,0 ./.:.
1 701 . T A 42.4147 . VDB=0.0221621;SGB=-0.556411;MQ0F=0.5;MQ=19;DP=4;DP4=0,0,0,4;AN=2;AC=2 GT:PL ./.:. 1/1:72,12,0 ./.:.
1 704 . T G 42.4147 . VDB=0.0190094;SGB=-0.556411;MQ0F=0.5;MQ=19;DP=4;DP4=0,0,0,4;AN=2;AC=2 GT:PL ./.:. 1/1:72,12,0 ./.:.
1 708 . C T,A 20.4535 . SGB=-0.379885;MQ0F=0;MQ=50;DP=2;DP4=0,0,0,2;AN=4;AC=2,2 GT:PL 1/1:50,3,0,.,.,. ./.:. 2/2:40,.,.,3,.,0
Here, for example, at position 708 there are two ALT bases (T, A), which means that for reference base C at this position, file1.sorted.bam has the ALT base T; file2.sorted.bam has the same base as the reference, which is why it shows ./.:.; and file3.sorted.bam has the ALT base A.
So, using bcftools, I want to make a file that looks something like this, where each sample column holds its assigned base (the reference base where the call is missing, otherwise the ALT base), so that I can use this VCF for further analysis:
#CHROM POS REF ALT file1.sorted.bam file2.sorted.bam file3.sorted.bam
1 687 G A G A G
1 689 G A G A G
1 701 T A T A T
1 704 T G T G T
1 708 C T,A T C A
I have looked into other posts where GATK is used for such jobs, but that also requires using GATK to call the variants. For this analysis I have to use bcftools for calling variants, but I can use other software to merge the VCF files and get the required result.
Thank you so much for your help!
That's not a VCF anymore, and I would strongly recommend sticking to commonly used file formats. That said, if I had to solve your issue, I would use Python and the cyvcf2 module.
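To give an idea of what such a script might look like, here is a minimal sketch. It deliberately uses only the Python standard library rather than cyvcf2, so it is a stand-in for that approach and not the cyvcf2 API. The function names (genotype_to_bases, vcf_to_base_table) are made up for illustration. It follows the assumption in the question that a missing call (./.) should be printed as the reference base; strictly speaking, ./. means "no call", not "same as reference", so treat that substitution with care. Heterozygous calls, which do not occur in the snippet, are written as both bases joined by a slash.

```python
def genotype_to_bases(gt, ref, alts):
    """Translate a GT string such as '1/1' into bases.

    Assumption taken from the question: a missing allele ('.') is
    reported as the reference base.
    """
    alleles = [ref] + alts
    bases = []
    for idx in gt.replace("|", "/").split("/"):
        bases.append(ref if idx == "." else alleles[int(idx)])
    # Collapse homozygous calls to a single base, keep order otherwise.
    return "/".join(dict.fromkeys(bases))


def vcf_to_base_table(lines):
    """Yield tab-separated rows: CHROM, POS, REF, ALT, one base per sample."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the tenth column of the header.
            yield "\t".join(["#CHROM", "POS", "REF", "ALT"] + fields[9:])
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        alts = alt.split(",")
        gt_index = fields[8].split(":").index("GT")
        calls = [
            genotype_to_bases(sample.split(":")[gt_index], ref, alts)
            for sample in fields[9:]
        ]
        yield "\t".join([chrom, pos, ref, alt] + calls)


# Two records from the snippet in the question, rebuilt with tab separators.
example = [
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "file1.sorted.bam", "file2.sorted.bam",
               "file3.sorted.bam"]),
    "\t".join(["1", "687", ".", "G", "A", "42.4147", ".", "DP=4;AN=2;AC=2",
               "GT:PL", "./.:.", "1/1:72,12,0", "./.:."]),
    "\t".join(["1", "708", ".", "C", "T,A", "20.4535", ".", "DP=2;AN=4;AC=2,2",
               "GT:PL", "1/1:50,3,0,.,.,.", "./.:.", "2/2:40,.,.,3,.,0"]),
]

rows = list(vcf_to_base_table(example))
for row in rows:
    print(row)
```

For the real file, the same generator can be fed an open file handle instead of the example list, for instance by iterating over open("out.vcf").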
It was my mistake to say I want a VCF file. You are right, it won't be a VCF anymore. I have edited my question. Thank you for the suggestion. I have never used Python or the cyvcf2 module. I will try them. Thank you!