Hi,
I have VCF files for 25 samples, all called with freebayes against the same reference. I want to extract the SNPs that are present in at least 80% of the samples (i.e. in any 20 of the 25). Kindly help me with it.
I have tried "bcftools isec". It outputs SNPs that are present in at least 20 samples, which is what I want. However, whichever sample is listed first in the file list seems to be used as the reference. Because of this, only SNPs present in my first sample plus any other 19 samples are output, which is what I don't want. It should output SNPs present in any 20 samples, whether or not the first sample is one of them.
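To make the requirement concrete, here is a minimal Python sketch of the counting I have in mind. It is only an illustration, not a tested pipeline: the function names (read_sites, sites_in_at_least) are made up for this post, it assumes one single-sample VCF per file, it treats any record in a file as "present" in that sample without checking the genotype, and it keys sites on CHROM, POS, REF and ALT.

```python
import gzip
from collections import Counter


def read_sites(path):
    """Return the set of SNP keys (CHROM, POS, REF, ALT) found in one VCF file.

    Assumption: any record in the file counts as "present" for that sample;
    genotypes are not inspected.
    """
    opener = gzip.open if path.endswith(".gz") else open
    sites = set()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip meta-information, the column header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.split("\t")[:5]
            # Split multi-allelic records so each ALT allele is counted on its own.
            for allele in alt.split(","):
                # Keep single-base substitutions only (drop indels, MNPs, complex).
                if len(ref) == 1 and len(allele) == 1:
                    sites.add((chrom, int(pos), ref, allele))
    return sites


def sites_in_at_least(paths, min_samples):
    """Return sorted SNP keys seen in at least `min_samples` of the given files.

    Every file contributes at most one count per site, so the result does not
    depend on which file is listed first.
    """
    counts = Counter()
    for path in paths:
        counts.update(read_sites(path))
    return sorted(site for site, n in counts.items() if n >= min_samples)


# Hypothetical usage with 25 per-sample files in the current directory:
#   import glob
#   shared = sites_in_at_least(sorted(glob.glob("*.vcf")), 20)
```

Because each file is reduced to a set before counting, a SNP that is missing from the first file but present in 20 others would still be reported, which is the behaviour I am after.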
I hope I have explained my problem clearly.
Ankit.
Thanks for the reply...
I am guessing I need to merge all the VCF files and then use this...