Hi,
I have a VCF file and want to use GATK SimulateReadsForVariants to generate simulated data containing its variants (mostly indels). Is there a way to specify that the variants be generated at a specific allele frequency?
This is the tool I hope to use: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_simulatereads_SimulateReadsForVariants.php
I am testing the sensitivity of various indel calling algorithms and want to see how they perform when the variants are present at allele frequencies of 10%, 5%, 1%, etc. A naive simulation method is fine at this point, since we are only in the first steps of this study, but I would also appreciate advice on more advanced techniques if you have any :)
Thanks!