Question

Resequencing Genes/Regions Identified By Gwas

9

14.2 years ago

Ryan D ★ 3.4k

We have a list of SNPs that were highly significant associations by independent GWAS. Now we are looking at re-sequencing 1) genes that were in LD with these variants and 2) regions in LD with these variants. We will enrich for regions using SureSelect to get our targets to something manageable (after identifying exons/miRNA binding sites/ESTs etc.).

My question is this: what criteria should be used to draw boundaries (say, a window) around a GWAS variant, particularly one in an intergenic region? I don't like the idea of using r2 measures, since other SNPs in high r2 with the common variants detected by GWAS would probably have come up already. I prefer the idea of looking in an LD block defined by some measure. Since r2 is not reliably estimated for low MAF variants from HapMap samples, I thought of using D' (D prime). The issue is that runs of D'=1 can extend over hundreds of kilobases, clearly beyond where real recombination takes place. What other measures or empirical boundaries around a SNP could be used to refine the region subject to resequencing? Preferably something easy to explain in a paper. The goal of the project is to find low MAF (and presumably higher penetrance) variants in these regions that are not necessarily detectable by GWAS.
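To make the r2 versus D' contrast concrete, here is a minimal sketch that computes both measures from haplotype and allele frequencies using the standard textbook definitions. The function name `ld_stats` and the example frequencies are made up for illustration and are not from any dataset.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical case: a GWAS allele at 30% frequency and a rare variant at 1%
# that occurs only on haplotypes carrying the GWAS allele.
d, d_prime, r2 = ld_stats(p_ab=0.01, p_a=0.30, p_b=0.01)
print(f"D={d:.4f}  D'={d_prime:.2f}  r2={r2:.3f}")
```

In this made-up case D' is essentially 1 while r2 stays below 0.05, which is why an r2 threshold around the GWAS SNP would tend to miss a rare variant even when it sits entirely on the associated haplotype.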

Thoughts? Thanks.

gwas linkage next-gen sequencing • 5.2k views

ADD COMMENT • link updated 11.0 years ago by Biostar 20 • written 14.2 years ago by Ryan D ★ 3.4k

1

Here's a paper where they used this approach, granted in a region with a gene. In this case D' values between the GWAS SNP and the rare variants they uncover appear to be 1. Any other papers on resequencing of GWAS regions with success? http://www.sciencemag.org/content/324/5925/387.full

ADD REPLY • link 14.2 years ago by Ryan D ★ 3.4k

score 5 · Answer 1 · 2011-02-09

5

14.2 years ago

Larry_Parnell 16k

Great, timely question. You have identified one of the issues with D' yourself. We tend not to use D' for this and other reasons. I'd suggest looking across the region of interest for important and relevant gene expression differences. You don't need to do the experiments yourself; take a look at expression data available from NCBI GEO or other resources. We have used this approach successfully to prioritize GWAS hits for further analysis.

In terms of placing boundaries on your regions of interest, I would say to use LD blocks. We like r2, but also consider recombination hotspots. The following were provided to me by email from Gilean McVean on 29 Jul 2009. These are human build 35, so you may want something more current.

http://www.stats.ox.ac.uk/~mcvean/OXSTAT/GeneticMap_b36/hotspots_b36.txt and http://ftp.hapmap.org/recombination/2006-10_rel21_phaseI+II/hotspots/
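As a rough sketch of the hotspot idea, the window around a SNP can be taken as the interval between the nearest flanking recombination hotspots, with a hard cap on each side. The layout of the hotspot files above is not described here, so the code assumes the hotspots for the SNP's chromosome have already been parsed into `(start, end)` tuples; the function name, the 500 kb cap, and the coordinates are hypothetical.

```python
def hotspot_bounded_window(snp_pos, hotspots, max_flank=500_000):
    """Return (left, right) bounds around snp_pos.

    hotspots  : iterable of (start, end) positions on the same chromosome
                and genome build as snp_pos (assumed already parsed)
    max_flank : fallback distance used when no hotspot lies closer
    """
    left = max(1, snp_pos - max_flank)
    right = snp_pos + max_flank
    for start, end in hotspots:
        if end < snp_pos:
            # Hotspot lies entirely to the left: window starts at its near edge.
            left = max(left, end)
        elif start > snp_pos:
            # Hotspot lies entirely to the right: window stops at its near edge.
            right = min(right, start)
        # A hotspot that contains the SNP gives no boundary and is ignored here.
    return left, right


# Made-up hotspot coordinates for illustration only.
example_hotspots = [
    (700_000, 705_000),
    (950_000, 952_000),
    (1_080_000, 1_083_000),
    (1_400_000, 1_402_000),
]
print(hotspot_bounded_window(1_000_000, example_hotspots))
```

The hotspot coordinates and the SNP position need to be on the same genome build, so older hotspot files may need a liftover before being used this way.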

In terms of papers looking at re-sequencing as GWAS follow-up, I know I've heard of groups doing this, but no papers come to mind. Could be a sign that this was not a wise approach - meaning that few variants associating with disease risk were found. That was our case for obesity and PLIN1. We found no new (i.e., rare) SNPs associating with obesity.

ADD COMMENT • link 14.2 years ago by Larry_Parnell 16k

0

Good stuff, Larry. Thanks. We tend to shy away from r2 because it may bias us toward finding variants correlated with the discovery (reported) GWAS hit. We are trying to investigate rare variants that may have r2=0 with the original discovery. But how far to look? In the end I may just try to blanket every exon or bit of functional sequence I can get through the genome browser tracks within 500kb of the associated SNP.
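A minimal sketch of that "blanket everything within 500kb" approach: keep every annotated feature that overlaps the window, merge overlapping intervals, and total the bases to see whether the target fits a capture design. It assumes the browser tracks have been exported as 0-based, half-open `(start, end)` intervals for one chromosome; the function name and the coordinates are made up.

```python
def targets_near_snp(snp_pos, features, flank=500_000):
    """Return (merged_intervals, total_bp) for features within flank of snp_pos.

    features : iterable of (start, end), 0-based half-open, same chromosome
    Features that straddle the window edge are kept whole rather than clipped.
    """
    win_start = max(0, snp_pos - flank)
    win_end = snp_pos + flank
    # Keep anything that overlaps the window, sorted by start position.
    hits = sorted((s, e) for s, e in features if s < win_end and e > win_start)
    merged = []
    for s, e in hits:
        if merged and s <= merged[-1][1]:
            # Overlapping or abutting the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    total_bp = sum(e - s for s, e in merged)
    return merged, total_bp


# Made-up exon/EST intervals for illustration only.
example_features = [
    (400_000, 400_500),      # outside the window
    (600_000, 600_200),
    (600_150, 600_400),      # overlaps the previous feature
    (1_499_900, 1_500_100),  # straddles the right edge, kept whole
    (1_600_000, 1_600_300),  # outside the window
]
intervals, size = targets_near_snp(1_000_000, example_features)
print(intervals, size)
```

Summing `total_bp` across all associated SNPs gives a quick check on whether the combined target is still a manageable size for enrichment.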

ADD REPLY • link 14.2 years ago by Ryan D ★ 3.4k

0

Hi Ryan. I am confused about what to do with significant SNPs after a GWAS, and I still have no clear idea of the next step once an association has been found. I have been reading a lot but have not reached any conclusion. Is re-sequencing the region in LD with the tag SNP required to establish the biological role of the SNP? Are there any other suggestions that would make my follow-up study worthwhile? Thanks

ADD REPLY • link 12.1 years ago by kumar.vinod81 ▴ 340

0

I'd recommend taking a good look at intron 1 as well as the regions you mention. I know intron 1 can be long, but there is plenty of evidence that regulatory elements reside here.

ADD REPLY • link 14.2 years ago by Larry_Parnell 16k

score 3 · Answer 2 · 2011-02-09

3

14.2 years ago

Giovanni M Dall'Olio 28k

Have a look at this draft for a manuscript: Principles for the post-GWAS functional characterisation of risk loci

Since it is still a draft, it can be difficult to read, but you may find many tools and discussion.

ADD COMMENT • link 14.2 years ago by Giovanni M Dall'Olio 28k