Obtaining all SNPs in LD with tag SNP above threshold (D-prime/LOD) from 1000 Genomes
9.5 years ago

I have a list of SNPs, and I want to find all SNPs in LD with them above a given threshold (D' > 0.99, LOD > 3). I've found tools that do this on the basis of the r2 value (e.g. HaploReg), but while these generally output the D' statistic, they omit the LOD value.

My current solution is to pull a region of 200 kb on either side of my tag SNP out of the 1000 Genomes VCF files, filter out the non-biallelic sites, and then compute all pairwise LD values using Haploview (which also outputs LOD). I then filter this output using the stated thresholds.

This takes quite a long time to run, and while it's a workable solution when I'm just looking at a few tag SNPs, I'm going to be doing it on a much larger scale and so would prefer something a little more efficient.

9.5 years ago

With PLINK 1.9,

plink --vcf [1000 genomes VCF] --r2 dprime --ld-snp-list [your SNP list] --ld-window-kb 200 --out [output filename prefix]

should perform the dprime computation more quickly.
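
Once PLINK has written its report, the D' threshold from the question can be applied with a few lines of Python. This is only a sketch: it assumes the report is the whitespace-delimited `.ld` table with a header row containing `SNP_A`, `SNP_B` and a `DP` column (which is what I'd expect from `--r2 dprime`), and `filter_ld_rows` is a made-up helper name, not part of PLINK.

```python
def filter_ld_rows(lines, min_dprime=0.99):
    """Return (SNP_A, SNP_B, D') tuples whose D' exceeds min_dprime.

    `lines` is any iterable of text lines from a PLINK-style LD report.
    Assumption: the first line is a header naming SNP_A, SNP_B and DP.
    """
    rows = iter(lines)
    header = next(rows).split()
    snp_a_idx = header.index("SNP_A")
    snp_b_idx = header.index("SNP_B")
    dp_idx = header.index("DP")

    hits = []
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        dprime = abs(float(fields[dp_idx]))  # tolerate signed D' as well
        if dprime > min_dprime:
            hits.append((fields[snp_a_idx], fields[snp_b_idx], dprime))
    return hits


# Example with a tiny hand-written table (values are illustrative only).
example = [
    " CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2 DP",
    " 1 1000 rsA 1 2000 rsB 0.35 1",
    " 1 1000 rsA 1 3000 rsC 0.80 0.92",
]
print(filter_ld_rows(example))
```

With a real report, something like `filter_ld_rows(open("myprefix.ld"))` should work, where `myprefix` stands for whatever was passed to `--out`. One caveat, as far as I recall the defaults: PLINK 1.9 only reports pairs with r2 of at least 0.2 and within a small variant-count window, so `--ld-window-r2 0` and a large `--ld-window` value are probably needed if high-D', low-r2 pairs matter.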

Edit: oops, you want the LOD score too? Hmm, I will look into adding that if it's more generally used.


Yeah, it's easy enough to get the r2 or D' values, but the LOD condition makes it more difficult as it seems to be less widely integrated.
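
Since the 1000 Genomes calls are phased, one possible workaround is to compute the LOD directly from haplotype counts. The sketch below assumes LOD means the log10 likelihood ratio of the observed two-locus haplotype frequencies against the frequencies expected under linkage equilibrium, which is my understanding of the Haploview definition; Haploview itself estimates haplotypes from genotypes, so its numbers may differ somewhat. `ld_stats` and its inputs are hypothetical names, and each input is a list of 0/1 alleles, one entry per phased haplotype.

```python
import math


def ld_stats(hap_a, hap_b):
    """Return (D', r2, LOD) for two biallelic sites, or None if either
    site is monomorphic. Inputs are equal-length sequences of 0/1 alleles,
    one entry per phased haplotype."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and equal length")

    # Count the four possible two-locus haplotypes.
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(hap_a, hap_b):
        counts[(a, b)] += 1

    n_a1 = counts[(1, 0)] + counts[(1, 1)]
    n_b1 = counts[(0, 1)] + counts[(1, 1)]
    if n_a1 in (0, n) or n_b1 in (0, n):
        return None  # LD is undefined when a site does not vary

    p_a1 = n_a1 / n
    p_b1 = n_b1 / n

    # D, then D' = |D| / Dmax with the usual sign-dependent Dmax.
    d = counts[(1, 1)] / n - p_a1 * p_b1
    if d >= 0:
        d_max = min(p_a1 * (1 - p_b1), (1 - p_a1) * p_b1)
    else:
        d_max = min(p_a1 * p_b1, (1 - p_a1) * (1 - p_b1))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a1 * (1 - p_a1) * p_b1 * (1 - p_b1))

    # LOD: sum over observed haplotypes of n_ij * log10(p_ij / (p_i * p_j)).
    lod = 0.0
    for (a, b), c in counts.items():
        if c == 0:
            continue  # an unobserved haplotype contributes nothing
        p_a = p_a1 if a == 1 else 1 - p_a1
        p_b = p_b1 if b == 1 else 1 - p_b1
        lod += c * math.log10((c / n) / (p_a * p_b))
    return d_prime, r2, lod
```

Pairs could then be kept when `d_prime > 0.99 and lod > 3`, matching the thresholds in the question. Extracting the 0/1 haplotype vectors from the VCF is left out here.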
