Given sequence data from 4 different populations, what are the disadvantages (and advantages) of doing joint/simultaneous variant calling across these 4 populations?
Side note: I already have a list of positions at which I want to call variants.
Assuming that your populations are "cohorts" as defined by GATK (http://gatkforums.broadinstitute.org/discussion/1186/best-practice-variant-detection-with-the-gatk-v4-for-release-2-0), joint calling may boost confidence and "rescue" SNPs that would be discarded if a single cohort were called on its own. Here is what they say on the linked page (Geraldine_VdAuwera): "When you do multi-sample calling, the genotyper uses the cohort information to evaluate confidence in calls that do not have strong support in individual samples... This should only affect marginal calls, but if you are looking for very rare variants then perhaps it would be better to call samples individually...."
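To make the "rescue" effect concrete, here is a toy sketch of how weak evidence at one site adds up across samples. It is not GATK's actual model. It assumes diploid samples, a flat per-read error rate of 0.01, Hardy-Weinberg genotype priors, and a simple grid search over the allele frequency. The function names `genotype_likelihoods` and `site_score` are made up for this illustration.

```python
import math


def genotype_likelihoods(ref_reads, alt_reads, error_rate=0.01):
    """P(reads | genotype) for 0, 1 and 2 copies of the alt allele (diploid)."""
    likelihoods = []
    for alt_copies in (0, 1, 2):
        frac = alt_copies / 2
        # Chance that a single read shows the alt allele under this genotype.
        p_alt = frac * (1 - error_rate) + (1 - frac) * error_rate
        likelihoods.append(p_alt ** alt_reads * (1 - p_alt) ** ref_reads)
    return likelihoods


def site_score(samples, error_rate=0.01):
    """Phred-scaled likelihood ratio: best 'site is variant' model vs. 'all hom-ref'.

    samples is a list of (ref_reads, alt_reads) tuples, one per sample.
    """
    best = 0.0  # allele frequency 0 gives a ratio of 1, i.e. a score of 0
    for step in range(1, 100):
        f = step / 100  # candidate alt allele frequency (assumed grid)
        priors = ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)  # Hardy-Weinberg
        log_ratio = 0.0
        for ref_reads, alt_reads in samples:
            gl = genotype_likelihoods(ref_reads, alt_reads, error_rate)
            marginal = sum(p * l for p, l in zip(priors, gl))
            log_ratio += math.log10(marginal / gl[0])
        best = max(best, log_ratio)
    return 10 * best


# Three samples, each with only 2 alt reads out of 10: marginal on their own.
cohort = [(8, 2), (8, 2), (8, 2)]
single = site_score(cohort[:1])
joint = site_score(cohort)
print(f"single-sample score: {single:.1f}, joint score: {joint:.1f}")
```

In this toy setup the joint score is about three times the single-sample score, which is the sense in which a marginal call gets more support from the cohort. It says nothing about the rare-variant caveat in the quote, which depends on details of the real genotyper.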
Thank you for your comment @seninp. I will do variant calling on separate cohorts to achieve the best result.