VCF files: keep all alternative alleles
6.4 years ago
juara ▴ 40

Hello

I have VCF outputs from Mutect2 (GATK4), run with no downsampling. Some variants have multiple alternative alleles, which are comma-separated in the ALT column. Before running VEP, I would like to preserve all of them, together with their respective TLOD and AF values, by writing a new VCF file in which each alternative allele is on its own line.

Here is an example:

Original VCF file:

chr1 149813407 . G A,T . . DP=17510;ECNT=9;POP_AF=5.000e-08,5.000e-08;TLOD=4.36,49.82 GT:AD:AF:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB 0/1/2:15000,22,59:0.049,0.051:7274,4,30:7726,18,29:32,33:149,153,144:48,45:13,7:0.010,0.010,3.912e-03:1.000,1.055e-10,3.564e-06

New VCF file:

chr1 149813407 . G A . . DP=17510;ECNT=9;POP_AF=5.000e-08;TLOD=4.36 GT:AD:AF:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB 0/1/2:15000,22,59:0.049,0.051:7274,4,30:7726,18,29:32,33:149,153,144:48,45:13,7:0.010,0.010,3.912e-03:1.000,1.055e-10,3.564e-06

chr1 149813407 . G T . . DP=17510;ECNT=9;POP_AF=5.000e-08;TLOD=49.82 GT:AD:AF:F1R2:F2R1:MBQ:MFRL:MMQ:MPOS:SA_MAP_AF:SA_POST_PROB 0/1/2:15000,22,59:0.049,0.051:7274,4,30:7726,18,29:32,33:149,153,144:48,45:13,7:0.010,0.010,3.912e-03:1.000,1.055e-10,3.564e-06

I am very new to VCF processing. I have looked at bcftools and vcftools, but to no avail. I would appreciate your help.

Thank you

snp bcftools vcf vcftools samtools • 3.8k views

@OP: There are tools to do that. I tested offline VEP, and the current version does not split multiallelic ALT records when writing VCF output with default parameters.


Yes, I tried that too. The bcftools comment works, though! Thanks.

6.4 years ago

https://samtools.github.io/bcftools/bcftools.html

Use bcftools norm [OPTIONS] file.vcf.gz, which can "split multiallelic sites into multiple rows", with the -m option:

 -m, --multiallelics -|+[snps|indels|both|any]
    split multiallelic sites into biallelic records (-) or join biallelic sites into multiallelic records (+). An optional type string can follow which controls variant types which should be split or merged together: If only SNP records should be split or merged, specify snps; if both SNPs and indels should be merged separately into two records, specify both; if SNPs and indels should be merged into a single record, specify any.
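
To make the effect of splitting concrete, here is a minimal Python sketch of what happens to per-allele INFO values when a multiallelic record is split. It is an illustration only, not how bcftools is implemented: the function name split_multiallelic and the per_alt_info parameter are hypothetical, it assumes tab-separated data lines, and it assumes POP_AF and TLOD carry one value per ALT allele (Number=A in the header). It leaves the FORMAT/sample columns untouched, as in the desired output in the question, whereas bcftools norm also rewrites per-allele FORMAT fields. For real data, bcftools norm -m -any is the better choice.

```python
def split_multiallelic(line, per_alt_info=("POP_AF", "TLOD")):
    """Split one tab-separated VCF data line into one line per ALT allele.

    per_alt_info lists INFO keys ASSUMED to hold one value per ALT allele
    (Number=A); a real tool would read this from the VCF header instead.
    """
    fields = line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    if len(alts) == 1:
        # Already biallelic: return the record unchanged.
        return [line.rstrip("\n")]

    info_items = fields[7].split(";")
    rows = []
    for i, alt in enumerate(alts):
        new_info = []
        for item in info_items:
            key, sep, value = item.partition("=")
            values = value.split(",")
            if sep and key in per_alt_info and len(values) == len(alts):
                # Keep only the value that belongs to this ALT allele.
                new_info.append(f"{key}={values[i]}")
            else:
                # Site-level fields (DP, ECNT, flags) are copied as-is.
                new_info.append(item)
        rows.append("\t".join(
            fields[:4] + [alt] + fields[5:7] + [";".join(new_info)] + fields[8:]
        ))
    return rows


if __name__ == "__main__":
    # Shortened version of the record from the question (fewer FORMAT fields).
    record = "\t".join([
        "chr1", "149813407", ".", "G", "A,T", ".", ".",
        "DP=17510;ECNT=9;POP_AF=5.000e-08,5.000e-08;TLOD=4.36,49.82",
        "GT:AD", "0/1/2:15000,22,59",
    ])
    for row in split_multiallelic(record):
        print(row)
```

Each output row keeps the site-level INFO fields and carries only the TLOD and POP_AF value for its own ALT allele.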

Thank you very much for your comment. This works like a charm! :) I am just wondering whether there is a function in bcftools with which I can merge FORMAT field data into the INFO field?
