Question

Population Private SNPs

1

5.8 years ago

ThePlaintiff ▴ 90

I am working with 1000 Genomes data. For every population, I'd like to find the SNPs that occur only in that population (population-private SNPs). My plan is to take set differences between the SNP sets of different populations: for YRI and LWK, say, I'd take all the SNPs in YRI and filter out those shared between YRI and LWK, then repeat the exercise for the other populations. I suspect this kind of functionality has already been implemented in one of the VCF analysis tools or genome analysis packages. If you know of a command or pipeline that does this, please let me know. I could code up the solution myself, but it would save me a great deal of time if I could avoid duplicating existing work.

next-gen SNP • 2.3k views

updated 4.7 years ago by dawson.white ▴ 10 • written 5.8 years ago by ThePlaintiff ▴ 90

1

Don't know about existing tools. But we could get the allele frequency per population, then use set operations to get the SNP lists?

5.8 years ago by zx8754 12k
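
The set-operation idea in the comment above can be sketched in a few lines of Python. This is only an illustration, assuming you have already extracted, for each population, the set of SNP IDs at which the allele of interest is observed at least once. The function name `private_snps` and the toy SNP IDs are made up for the example.

```python
from collections import defaultdict


def private_snps(snps_by_pop):
    """Return {population: set of SNP IDs observed in that population only}."""
    # Record, for every SNP, the populations in which it was observed.
    seen_in = defaultdict(set)
    for pop, snps in snps_by_pop.items():
        for snp in snps:
            seen_in[snp].add(pop)

    # A SNP is private when exactly one population carries it.
    private = {pop: set() for pop in snps_by_pop}
    for snp, pops in seen_in.items():
        if len(pops) == 1:
            (only_pop,) = pops
            private[only_pop].add(snp)
    return private


# Toy input: the SNP IDs are hypothetical, not real 1000 Genomes variants.
snps_by_pop = {
    "YRI": {"rs1", "rs2", "rs3"},
    "LWK": {"rs2", "rs4"},
    "CEU": {"rs3", "rs5"},
}
result = private_snps(snps_by_pop)
```

Counting populations per SNP in a single pass avoids computing a separate pairwise difference for every pair of populations, which is what the original plan would require.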

0

^^ It does indeed seem to be as straightforward as zx8754 describes. The allele frequency data can be used to infer which alleles are present in only one population group. If I were actively working on this, I would spend some time getting the 1000 Genomes data into a single BCF and also a PLINK dataset, where it would then be easier to work with.

5.8 years ago by Kevin Blighe 88k

0

Thanks, I have one more question. I split the bed files by sub-population by running plink --bfile <MyFile.bed> --keep </path/to/sample/ids>. How do I test whether the allele frequencies differ across populations? I am considering 7 sub-populations of the 1000 Genomes data set for my analysis. I believe that I'll need to build a phenotype file for this, but I am not clear on how to build the file and run it in plink. I would appreciate an example of the file format and, if possible, the plink commands.

5.7 years ago by ThePlaintiff ▴ 90
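
One possible route, offered as a sketch rather than a tested recipe: instead of splitting the fileset, keep it whole and give PLINK a cluster file with one `FID IID CLUSTER` line per sample. This assumes PLINK 1.9, where something like `plink --bfile <prefix> --freq --within <clusters.txt>` should write a per-cluster frequency report (`plink.frq.strat`) with the columns `CHR SNP CLST A1 A2 MAF MAC NCHROBS`. The Python below writes such a cluster file and then scans a report of that assumed layout for SNPs whose A1 allele has a non-zero count in exactly one cluster. The helper names, sample IDs, and toy report are hypothetical, and this does not run a significance test on the frequency differences.

```python
import io
from collections import defaultdict


def write_cluster_file(samples, handle):
    """Write one 'FID IID CLUSTER' line per (fid, iid, population) tuple."""
    for fid, iid, pop in samples:
        handle.write(f"{fid} {iid} {pop}\n")


def private_from_frq_strat(handle):
    """Return {cluster: [SNP IDs whose A1 count is non-zero only in that cluster]}.

    Assumes a whitespace-delimited report with a header row that contains
    at least the columns SNP, CLST and MAC.
    """
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}

    # For every SNP, collect the clusters in which A1 was seen at least once.
    carriers = defaultdict(set)
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if int(fields[col["MAC"]]) > 0:
            carriers[fields[col["SNP"]]].add(fields[col["CLST"]])

    private = defaultdict(list)
    for snp, clusters in carriers.items():
        if len(clusters) == 1:
            (only_cluster,) = clusters
            private[only_cluster].append(snp)
    return dict(private)


# Hypothetical sample IDs, written to an in-memory file for illustration.
cluster_out = io.StringIO()
write_cluster_file([("S1", "S1", "YRI"), ("S2", "S2", "LWK")], cluster_out)

# Toy report in the assumed .frq.strat layout; the values are made up.
toy_report = io.StringIO(
    " CHR SNP CLST A1 A2 MAF MAC NCHROBS\n"
    "   1 rs1 YRI  A  G 0.10  2 20\n"
    "   1 rs1 LWK  A  G 0.00  0 20\n"
    "   1 rs2 YRI  C  T 0.05  1 20\n"
    "   1 rs2 LWK  C  T 0.15  3 20\n"
)
private = private_from_frq_strat(toy_report)
```

Note that `--bfile` expects the fileset prefix (without the `.bed` extension), and that A1 in the report is the minor allele across the whole dataset, so "private" here means the overall minor allele is carried by a single cluster.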

score 1 · Answer 1 · 2020-03-13

1

4.7 years ago

dawson.white ▴ 10

vcf-contrast is designed to do this. http://vcftools.sourceforge.net/perl_module.html#vcf-contrast
