Question

Programming Challenge: Quickest Way To Determine The "Superpopulation" From A Vcf?

4

10.9 years ago

Jeremy Leipzig 22k

Given an exome or targeted human VCF of one or more samples, I need a program to determine the "superpopulation" of each sample, as listed here:

http://www.1000genomes.org/category/frequently-asked-questions/population

ASN EUR AFR AMR SAN

The program should return a single three-letter code for each sample.

Submissions will be judged on speed using 10 randomly selected subsets of 1KG samples; you cannot count on any "crucial" regions being covered.

Each "miss" will result in a penalty that is effectively 50% of the best time for the next best tier (a miss of one call will tack on half the entire time it took to call all 10 correctly).

vcf • 2.8k views

updated 9.2 years ago by Biostar 20 • written 10.9 years ago by Jeremy Leipzig 22k

0

So what am I allowed to rely on, if I cannot count on any specific region being there? How targeted could it be? Clearly some target regions will be uninformative...

10.9 years ago by zam.iqbal.genome ★ 1.9k

0

Sometimes we receive targeted resequencing samples that are, for example, just a bunch of cardiac genes. I would still like to make a guess as to the superpopulation.

10.9 years ago by Jeremy Leipzig 22k

score 0 · Answer 1 · 2013-12-27

0

10.9 years ago

zam.iqbal.genome ★ 1.9k

Actually, my previous comment is an attempt at an answer, so here it is (will try and delete the comment):

Well, in general I would not expect this always to be possible. I would take the 1000 Genomes SNP calls in your gene(s), run a PCA (colouring each sample by population), and see whether the superpopulations separate in the PCA. If they do, it's very cheap to project your sample onto the same principal components and see where it lies compared with the 1000G populations. That's what I'd do, but I'm not an expert on that type of thing!
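A minimal sketch of that idea in Python with NumPy, for illustration only. It assumes the genotypes have already been extracted from the VCFs into matrices of alternate-allele counts (0/1/2) over the same variants, with no missing calls, and it adds a nearest-centroid rule to turn "where it lies" into a three-letter call. The function names (`fit_pca`, `project`, `classify`) and the synthetic demo data are hypothetical and not part of any existing tool.

```python
import numpy as np


def fit_pca(ref_genotypes, n_components=2):
    """Fit a PCA on reference genotypes (n_samples x n_variants, values 0/1/2)."""
    mean = ref_genotypes.mean(axis=0)
    centered = ref_genotypes - mean
    # Rows of vt are the principal axes of the centered reference matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(genotypes, mean, components):
    """Project genotypes onto principal axes fitted on the reference panel."""
    return (genotypes - mean) @ components.T


def classify(sample_pcs, ref_pcs, ref_labels):
    """Assign each sample the label of the nearest reference centroid."""
    ref_labels = np.asarray(ref_labels)
    labels = sorted(set(ref_labels.tolist()))
    centroids = np.array(
        [ref_pcs[ref_labels == lab].mean(axis=0) for lab in labels]
    )
    # Euclidean distance from every sample to every centroid.
    dists = np.linalg.norm(sample_pcs[:, None, :] - centroids[None, :, :], axis=2)
    return [labels[i] for i in dists.argmin(axis=1)]


if __name__ == "__main__":
    # Synthetic stand-in for a reference panel: three made-up groups with
    # independent allele frequencies at 200 variants (NOT real 1KG data).
    rng = np.random.default_rng(0)
    n_variants = 200
    freqs = {
        pop: rng.uniform(0.05, 0.95, size=n_variants)
        for pop in ("AFR", "ASN", "EUR")
    }
    ref = np.vstack([rng.binomial(2, f, size=(50, n_variants)) for f in freqs.values()])
    ref_labels = [pop for pop in freqs for _ in range(50)]

    mean, comps = fit_pca(ref.astype(float), n_components=2)
    ref_pcs = project(ref, mean, comps)

    # One simulated query sample per group, classified against the panel.
    queries = np.vstack([rng.binomial(2, f, size=(1, n_variants)) for f in freqs.values()])
    print(classify(project(queries, mean, comps), ref_pcs, ref_labels))
```

Whether this separates the superpopulations for a given target region is exactly the open question raised above; with few or uninformative variants the centroids may overlap and the call becomes a guess.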

10.9 years ago by zam.iqbal.genome ★ 1.9k

0

Automate and implement

10.9 years ago by Jeremy Leipzig 22k

1

You're quite right, you asked for a program, not a description of how to do it. I don't have time to do this now though, so I'll bow out of the rest of this discussion.

10.9 years ago by zam.iqbal.genome ★ 1.9k