I have VCF data for >100 individuals published by a completely separate group. I also have Plink SNP data from the Human Origins array, and I want to merge the VCF data into the Plink data.
What would be good settings for filtering SNPs? The VCF files have already been filtered for DP >= 10 and GQ >= 30. What other filters should I apply?
I just need to filter down to the SNPs for which I have Plink data, and then determine which of those calls are good enough to use. Orthologs/paralogs are not particularly relevant to me. I am not looking at individual genes, just ancestry and Neanderthal introgression.
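To make the first step concrete, here is a minimal sketch of what I mean by filtering down to the Plink SNPs. It is only an illustration under several assumptions: the function names are made up, the Plink sites come from a standard six-column .bim file, both datasets use the same reference build, chromosome names differ at most by a "chr" prefix, and only biallelic single-base SNPs are wanted.

```python
VALID_BASES = {"A", "C", "G", "T"}


def normalise_chrom(chrom):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal (assumed naming difference)."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def load_bim_sites(bim_lines):
    """Collect (chromosome, position) pairs from the lines of a Plink .bim file."""
    sites = set()
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, pos = fields[0], fields[3]
        sites.add((normalise_chrom(chrom), int(pos)))
    return sites


def filter_vcf_lines(vcf_lines, sites):
    """Yield VCF header lines plus biallelic SNP records located at the given sites."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep all header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if (normalise_chrom(chrom), pos) not in sites:
            continue  # site is not on the array
        if ref not in VALID_BASES or alt not in VALID_BASES:
            continue  # drops indels, multiallelic records, and ALT == "."
        yield line
```

With real files this would be called with open file handles, for example `filter_vcf_lines(open("calls.vcf"), load_bim_sites(open("array.bim")))`, where both file names are placeholders. Matching on position alone does not check that the alleles or strand agree between the two datasets, so that would still need to be verified before merging.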