How do I split 1000 Genomes VCF files by sub-population, keeping only the variants that are actually present in each sub-population? For example, if I have the 1000 Genomes chromosome 10 file as chr10.vcf, I'd like to get from it chr10_LWK.vcf (LWK sub-population), chr10_YRI.vcf (YRI sub-population), etc. I would then like to find SNPs that are present in LWK but absent in YRI using bcftools isec or contrast.
Thanks
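For anyone landing here, one way this could be approached is sketched below in Python. It is a rough sketch, not a tested pipeline. It assumes a tab-separated panel file whose first two columns are the sample ID and the population code, and it only builds the bcftools command lines rather than running them. The function names, file names and sample IDs (S1, S2, S3) are made up for illustration. The subsetting is split into two `bcftools view` steps (first `-S` to keep the samples, then `--min-ac 1` to drop sites with no alternate allele left among them), because the bcftools documentation recommends separating sample subsetting from allele-count filtering. `bcftools isec -C` should then report records found in the first file but not in the second; it expects bgzip-compressed, indexed inputs.

```python
from collections import defaultdict


def samples_by_population(panel_lines):
    """Group sample IDs by population code from panel-style lines.

    Assumes column 1 is the sample ID and column 2 the population code;
    a header line starting with 'sample' is skipped.
    """
    groups = defaultdict(list)
    for line in panel_lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] == "sample":
            continue
        groups[fields[1]].append(fields[0])
    return dict(groups)


def subset_commands(vcf, samples_file, out_vcf):
    """Return two bcftools commands meant to be piped together.

    Step 1 keeps only the listed samples (-S) and streams uncompressed BCF.
    Step 2 reads from stdin ('-') and keeps sites with at least one
    alternate allele among the remaining samples (--min-ac 1).
    """
    keep_samples = ["bcftools", "view", "-S", samples_file, "-Ou", vcf]
    drop_absent = ["bcftools", "view", "--min-ac", "1", "-Oz", "-o", out_vcf, "-"]
    return keep_samples, drop_absent


def private_sites_command(first_vcf, second_vcf, out_dir):
    """bcftools isec -C: records present in first_vcf but absent from second_vcf."""
    return ["bcftools", "isec", "-C", "-p", out_dir, first_vcf, second_vcf]


# Made-up panel content, for illustration only.
panel = [
    "sample\tpop\tsuper_pop\tgender",
    "S1\tLWK\tAFR\tmale",
    "S2\tYRI\tAFR\tfemale",
    "S3\tLWK\tAFR\tfemale",
]
groups = samples_by_population(panel)
step1, step2 = subset_commands("chr10.vcf.gz", "LWK.samples.txt", "chr10_LWK.vcf.gz")
isec_cmd = private_sites_command("chr10_LWK.vcf.gz", "chr10_YRI.vcf.gz", "LWK_not_YRI")
print(" ".join(step1), "|", " ".join(step2))
print(" ".join(isec_cmd))
```

Each population's sample list would be written to a text file (one ID per line) before running the first command, and the per-population outputs would need indexing with `bcftools index` before `isec`.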
Thank you Kevin. Your tutorial provided many insights.