Select one variant from multiple in ALT in VCF file
3.7 years ago
FL512 ▴ 20

Let's say my VCF file has a C in REF and A,TTA in ALT (sometimes there are even more than three variants in the ALT column). This interrupts my analyses, because the tools I would like to use require exactly one variant in ALT.

Therefore, I want to select one ALT allele based on the minor allele frequency (I believe this criterion is reasonable). The header of my VCF and an example record are as follows:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR
chr1 10000 . C A,TTA . . DP=100;ECNT=2;MBQ= .......... GT:AD:AF:DP 0/0:22,10,0:0.290,0.031:30  0/1/2:60,18,6:0.210,0.073:90

How could I make a final VCF like

chr1 10000 . C A . . DP=100;ECNT=2;MBQ= ..........

That is, I want to keep only the variant that has the highest minor allele frequency.

Are there any tools or any kind of script available, or if someone has already addressed this, please kindly let me know! Thank you.

VCF Filter multiplevariants WGS • 1.9k views
3.7 years ago
Medhat 9.8k

bcftools norm -m -any splits multi-allelic sites into separate biallelic records (one ALT per line); you can then filter those records by allele frequency.
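
If you would rather pick the allele directly instead of splitting first, here is a minimal Python sketch of the selection step. It is not a bcftools feature, just an illustration under several assumptions: the file is tab-separated, the sample of interest is the last column (TUMOR in the question), FORMAT contains an AF field with one value per ALT allele, and only the first eight columns are written out, as in the desired output above. The names pick_top_alt, filter_vcf, sample_index and EXAMPLE are made up for this example, and the INFO string in EXAMPLE is shortened to the part visible in the question.

```python
def pick_top_alt(vcf_line, sample_index=-1):
    """Return the first 8 VCF columns, keeping only the ALT allele with
    the highest AF in the chosen sample column (default: last sample)."""
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    if len(alts) > 1:
        # Map FORMAT keys (GT, AD, AF, ...) to this sample's values.
        keys = fields[8].split(":")
        values = dict(zip(keys, fields[9:][sample_index].split(":")))
        # AF is assumed to hold one value per ALT allele; "." counts as 0.
        afs = [0.0 if x == "." else float(x) for x in values["AF"].split(",")]
        best = max(range(len(alts)), key=lambda i: afs[i])
        fields[4] = alts[best]
    # Note: per-allele INFO fields are NOT trimmed in this sketch.
    return "\t".join(fields[:8])


def filter_vcf(lines):
    """Pass header lines through unchanged and rewrite data lines."""
    for line in lines:
        if line.startswith("#"):
            yield line.rstrip("\n")
        else:
            yield pick_top_alt(line)


# Example record from the question, with a shortened INFO column.
EXAMPLE = (
    "chr1\t10000\t.\tC\tA,TTA\t.\t.\tDP=100;ECNT=2\tGT:AD:AF:DP\t"
    "0/0:22,10,0:0.290,0.031:30\t0/1/2:60,18,6:0.210,0.073:90"
)
print(pick_top_alt(EXAMPLE))
```

In the TUMOR column the AF values are 0.210 for A and 0.073 for TTA, so A is kept. Because the sample columns are dropped and INFO is left untouched, this is only suitable when the downstream tool needs the site-level columns; for anything that must keep genotypes consistent, splitting with bcftools norm and filtering afterwards is the safer route.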

Hi Medhat,

Thank you very much for your quick response. I took a look at the bcftools documentation, and it looks like that is what I was looking for. I will give it a try.

Thank you again!
