18 months ago
miguellarrazlopezdenovales
Hello everyone,
I want to filter a multi-sample VCF file to keep only those variants that differ from the reference (i.e., genotype is not 0/0) in more than x samples. For example, if x is 5, I want to keep only SNPs that vary in more than 5 samples in my dataset. Thanks in advance.
Use a filtering expression (https://samtools.github.io/bcftools/bcftools.html#expressions), something like:
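As a rough illustration of the logic such an expression has to encode, here is a plain-Python sketch. It is not bcftools syntax and not the command from the original answer. It assumes an uncompressed, tab-separated VCF with a GT entry in the FORMAT column, and the function names `count_non_ref_samples` and `filter_vcf_lines` are made up for this example. In bcftools itself, the linked expressions page documents `COUNT(...)` and `GT="alt"`, so something along the lines of `bcftools view -i 'COUNT(GT="alt")>5'` is probably what is meant, but I have not tested that here.

```python
def count_non_ref_samples(sample_fields, gt_index=0):
    """Count samples whose genotype carries at least one non-reference allele."""
    n = 0
    for field in sample_fields:
        gt = field.split(":")[gt_index]
        # Treat phased (|) and unphased (/) genotypes the same way.
        alleles = gt.replace("|", "/").split("/")
        # Reference ("0") and missing (".") alleles do not count as variation.
        if any(a not in ("0", ".") for a in alleles):
            n += 1
    return n


def filter_vcf_lines(lines, x):
    """Yield header lines, plus records that are non-reference in more than x samples."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        fmt = cols[8].split(":")
        if "GT" not in fmt:
            continue  # no genotypes on this record, nothing to count
        if count_non_ref_samples(cols[9:], fmt.index("GT")) > x:
            yield line


# Toy three-sample VCF, built with explicit tabs (made-up data for illustration).
TOY_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "S1", "S2", "S3"]),
    "\t".join(["1", "100", ".", "A", "G", ".", ".", ".", "GT", "0/0", "0/1", "1/1"]),
    "\t".join(["1", "200", ".", "C", "T", ".", ".", ".", "GT", "0/0", "0/0", "0/1"]),
    "\t".join(["1", "300", ".", "G", "A,T", ".", ".", ".", "GT:DP", "./.:0", "0|1:9", "1|2:7"]),
])

if __name__ == "__main__":
    # Keep records that vary in more than 1 sample.
    for record in filter_vcf_lines(TOY_VCF.splitlines(), x=1):
        print(record)
```

Note that "more than x" is a strict comparison here. If you want "at least x samples", change `>` to `>=`. Missing genotypes (`./.`) are treated as not varying, which may or may not be what you want for your data.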
What have you tried? You've mentioned bcftools as a tag, have you read the manual?
I've tried bcftools, which seems to be the closest I've got, but I can't find anything to do this. I have also found vcftools --max-non-ref-ac, which I think might be able to do this, but I am not really sure how to use it, and was wondering if anyone has any experience here. Thanks
It's in there. If you had looked a little deeper, you would have found - on your own - exactly what Pierre has pointed to below.