I am running into a problem that has been asked about in this forum before, but no working solution has been provided so far.
I regularly have to subset and merge multi-sample VCFs during my data processing. However, I have realized that important INFO fields such as AF (allele frequency), AC (allele count), etc. aren't being updated as they should be.
I have used GATK, and it is able to remove sites that are non-variant, have unused alternate alleles, etc. But the important allele-level information isn't being updated as required. I also tried VCFtools, BCFtools and vcf-merge, but none of them has helped so far.
One tool,
https://github.com/opencb/hpg-variant
https://github.com/opencb/hpg-variant/wiki/VCF-Tools-tutorial
seems to be helpful, but there is a problem with the installation because of the (old) package versions it requires.
Any suggestions?
AF and AC are changed only when you subset samples. Subsetting sites won't affect their values in any way. Can you show me examples of commands that are not changing the AC and AF values? All the tools I've tried ([bv]cftools, SelectVariants, etc.) do this very well for me.

Ram, compare the two VCF outputs at the same site:
The AF is 1 (alt allele) / 8 (total alleles) = 0.125, which is correct. This is because all the samples were first split and then merged.
But when I now add two extra samples, the AF should be 1/14 = 0.0714, yet the AF is still at the old value.
So, I am guessing that AF might be changing when subsetting, but not while merging.
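To make the expected values concrete, here is a minimal Python sketch that recounts AC, AN and AF directly from the GT strings of one record. It is only an illustration, not what any of the tools above do internally: the function name recompute_ac_an_af and the genotype lists are hypothetical, and it assumes GT values such as 0/1 or 0|0, with "." for a missing allele.

```python
import re

def recompute_ac_an_af(genotypes, alt_index=1):
    """Recount AC, AN and AF for one ALT allele from a list of GT strings.

    Hypothetical helper for illustration; alt_index=1 means the first ALT.
    """
    ac = 0  # copies of the chosen ALT allele
    an = 0  # total number of called alleles
    for gt in genotypes:
        # Split on both unphased (/) and phased (|) separators.
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing calls do not count towards AN
            an += 1
            if int(allele) == alt_index:
                ac += 1
    af = ac / an if an else None
    return ac, an, af

# Hypothetical genotypes: one heterozygous carrier among diploid samples.
before = ["0/1", "0/0", "0/0", "0/0"]      # 8 called alleles
after = before + ["0/0", "0/0", "0/0"]     # 14 called alleles

print(recompute_ac_an_af(before))          # (1, 8, 0.125)
ac, an, af = recompute_ac_an_af(after)
print(ac, an, round(af, 4))                # 1 14 0.0714
```

If the AF in the merged record does not match a recount like this, the INFO field was most likely carried over from the input files rather than recalculated from the genotypes.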
Can you paste the individual records for all 7 samples please? I'm curious about what's going on here.