Hi everyone,
I'm processing .vcf files for the first time and am hoping for some advice. I recently filtered a set of VCFs to exclude around half of the subjects, and now want to update the INFO columns in my output files to reflect the stats of the new filtered dataset.
I've tried using fill-an-ac from vcftools, which successfully updated the AN/AC fields, but it doesn't update the allele frequency (AF) or minor allele frequency (MAF) fields. I know that I could use the AN/AC values to derive AF/MAF, but I was wondering if one of the existing tools has an option to update these fields automatically, similar to fill-an-ac?
I'd really appreciate any suggestions you might have.
Thanks in advance!
Thanks for the quick reply. This approach seems like it should be quite straightforward. The difficulty is that I'm running this on an HPC cluster, where running Java programs is not very straightforward given the permissions and the modules available.
Are there any other approaches that are non-java based, or workarounds for HPCs? Thanks again!
Why Java and not Python or whatever? This program uses streaming, so it doesn't require much memory.
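If Java really isn't an option on the cluster, the manual AN/AC derivation mentioned in the question can be done in plain Python with the same streaming idea. This is only a rough sketch, not any tool's actual implementation: the function names `update_info` and `update_vcf` are made up for illustration, it assumes AN/AC are already correct (e.g. after fill-an-ac), and it assumes MAF is wanted per ALT allele as min(AF, 1 - AF), which may not match how other tools define MAF at multi-allelic sites.

```python
def update_info(info):
    """Return an INFO string with AF and MAF recomputed from AC and AN.

    Assumption: MAF is reported per ALT allele as min(AF, 1 - AF).
    """
    fields = [f for f in info.split(";") if f and f != "."]
    tags = dict(f.split("=", 1) for f in fields if "=" in f)
    if "AC" not in tags or "AN" not in tags:
        return info  # nothing to derive from, leave the record untouched
    an = int(tags["AN"])
    if an == 0:
        return info  # no called alleles, frequencies are undefined
    afs = [int(count) / an for count in tags["AC"].split(",")]
    new = {
        "AF": ",".join(f"{af:.6g}" for af in afs),
        "MAF": ",".join(f"{min(af, 1 - af):.6g}" for af in afs),
    }
    # Drop stale AF/MAF entries, keep everything else in its original order.
    kept = [f for f in fields if f.split("=", 1)[0] not in new]
    return ";".join(kept + [f"{key}={value}" for key, value in new.items()])


def update_vcf(lines):
    """Yield VCF lines one at a time with the INFO column (8th) updated."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines pass through unchanged
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 7:
            cols[7] = update_info(cols[7])
        yield "\t".join(cols) + "\n"
```

To run it on a file, something like `sys.stdout.writelines(update_vcf(sys.stdin))` (after `import sys`) reads one record at a time, so memory use stays small. Note that the sketch does not touch the header, so if AF or MAF were not already declared there, the matching `##INFO` lines would need to be added separately.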