Let's say my VCF file has a C in REF and A,TTA in ALT (sometimes there are even more than three variants in the ALT column). This interrupts my analyses, because the tools I would like to use require exactly one variant in ALT.
Therefore, I want to select one ALT allele based on its minor allele frequency (I believe this criterion is reasonable). The header of my VCF and an example record are as follows:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR
chr1 10000 . C A,TTA . . DP=100;ECNT=2;MBQ= .......... GT:AD:AF:DP 0/0:22,10,0:0.290,0.031:30 0/1/2:60,18,6:0.210,0.073:90
How could I produce a final VCF like the following?
chr1 10000 . C A . . DP=100;ECNT=2;MBQ= ..........
That is, I want to keep only the variant that has the highest minor allele frequency.
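To make the rule concrete, here is a rough Python sketch of the selection I have in mind. It is only an illustration under several assumptions, not a tested tool: the function name `pick_top_alt` is made up, the record is assumed to be tab-separated with a FORMAT column containing an `AF` key, the frequency is read from the last sample column (TUMOR in my file), and the INFO string in the example is shortened to the two keys that are fully shown above. It only writes the first eight columns, so it does not recode GT or AD, and it does not subset INFO fields that hold one value per ALT allele.

```python
def pick_top_alt(line, sample_index=-1, key="AF"):
    """Return the first eight VCF columns with ALT reduced to a single allele.

    The ALT allele kept is the one with the highest `key` value (AF by
    default) in the chosen sample column. Assumes a tab-separated record
    that has FORMAT and at least one sample column.
    """
    fields = line.rstrip("\n").split("\t")
    alts = fields[4].split(",")

    if len(alts) > 1:
        fmt_keys = fields[8].split(":")
        sample = fields[sample_index].split(":")
        raw = sample[fmt_keys.index(key)].split(",")
        # Missing values (".") sort below any real frequency.
        freqs = [float(x) if x != "." else float("-inf") for x in raw]
        # Index of the ALT allele with the highest frequency;
        # ties resolve to the first such allele.
        best = max(range(len(alts)), key=lambda i: freqs[i])
        fields[4] = alts[best]

    # Keep only the site-level columns (CHROM through INFO).
    return "\t".join(fields[:8])


record = (
    "chr1\t10000\t.\tC\tA,TTA\t.\t.\tDP=100;ECNT=2\tGT:AD:AF:DP\t"
    "0/0:22,10,0:0.290,0.031:30\t0/1/2:60,18,6:0.210,0.073:90"
)
print(pick_top_alt(record))
```

For the example record this keeps A, because its AF in the TUMOR column (0.210) is higher than that of TTA (0.073). Passing a different `sample_index` would rank the alleles by another sample instead.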
Are there any tools or scripts available for this? If someone has already addressed this, please kindly let me know! Thank you.
Hi Medhat,
Thank you very much for your quick response. I took a look at the bcftools instructions, and it looks like that is what I was looking for. I will give it a try.
Thank you again!