Hi, I recently obtained a VCF file with 90 samples from different generations. The file is extremely large because it contains many probable sequencing errors: variants whose AF is very close to 0 or 1 and that appear in only one of the 90 samples. For this reason, I want to filter the whole file, keeping only the SNPs for which at least one sample has an AF between 0.15 and 0.85. Do you know how I can do this?
I tried this command, "bcftools view -i 'sum(AF >= 0.15 && AF <= 0.85) > 0'", but it didn't give me the expected result. Thank you!
AF is usually an INFO field, not a FORMAT/genotype field. How is AF defined in your VCF header? Please also give us an example of such genotypes.
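If AF does turn out to be a per-sample FORMAT field in your file, here is a minimal Python sketch of the filter you describe. It rests on assumptions I can't check without seeing your header: that AF is listed in the FORMAT column, that the file is plain tab-separated VCF text, and that missing values are written as ".". The function names `has_intermediate_af` and `filter_vcf` are made up for this example.

```python
def has_intermediate_af(vcf_line, low=0.15, high=0.85, tag="AF"):
    """Return True if at least one sample on this VCF data line has
    low <= tag <= high. Assumes `tag` is a per-sample FORMAT field."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Columns 1-8 are fixed, column 9 is FORMAT, columns 10+ are samples.
    if len(fields) < 10:
        return False
    keys = fields[8].split(":")
    if tag not in keys:
        return False
    idx = keys.index(tag)
    for sample in fields[9:]:
        values = sample.split(":")
        if idx >= len(values):
            continue  # trailing sample fields may be omitted
        # Multi-allelic sites can carry comma-separated values.
        for value in values[idx].split(","):
            try:
                af = float(value)
            except ValueError:
                continue  # "." (missing) or otherwise non-numeric
            if low <= af <= high:
                return True
    return False


def filter_vcf(lines, low=0.15, high=0.85):
    """Yield header lines unchanged, plus data lines that pass the AF test."""
    for line in lines:
        if line.startswith("#") or has_intermediate_af(line, low, high):
            yield line
```

To use it, open the uncompressed VCF, pass the file object to `filter_vcf`, and write each yielded line to a new file. Within bcftools itself, the filtering-expression documentation describes the `FMT/` prefix for per-sample tags and a single `&` for conditions that must hold within the same sample, so something like "bcftools view -i 'FMT/AF>=0.15 & FMT/AF<=0.85'" may be worth trying, although I have not tested it on your data.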
Hi, the definition of AF in my header is this one:
Thank you in advance
I don't understand what AF refers to in this scenario when the genotype is ./.
Are these diploid organisms?