Question

How do scientists identify new genes/SNPs as candidates for GWAS studies?

1

8.4 years ago

nohaseddik ▴ 10

I am working on an automatic extraction tool that should list all SNPs published for a given disease (all associations, whether negative or positive). This is an attempt to help scientists target certain SNPs or loci for a specific disease...

I am researching the feasibility of the system, so would such a system be helpful? And how else do scientists identify SNPs?

gwas SNP gwas catalogue candidate gene • 2.4k views

Updated 8.4 years ago by Fabio Marroni ★ 3.0k • written 8.4 years ago by nohaseddik ▴ 10

score 2 · Answer 1 · 2016-06-28

8.4 years ago

Emily 24k

Is what you're asking not what the GWAS Catalog already offers?
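For readers who want to see what such a lookup amounts to, here is a minimal sketch of listing the SNPs reported for one trait from a catalog-style association table. The column names (`PUBMEDID`, `DISEASE/TRAIT`, `SNPS`, `P-VALUE`) are assumed to resemble the GWAS Catalog download file and should be checked against the real file; the rows, rsIDs and the function name `snps_for_trait` are hypothetical placeholders.

```python
import csv
import io

# Placeholder table; column names are an assumption, rows are made up.
CATALOG_TSV = (
    "PUBMEDID\tDISEASE/TRAIT\tSNPS\tP-VALUE\n"
    "00000001\tExample disease A\trs0000001\t1E-9\n"
    "00000002\tExample disease B\trs0000002\t3E-8\n"
    "00000003\tExample disease A\trs0000003\t5E-12\n"
)


def snps_for_trait(tsv_text, trait):
    """Return (snp, pubmed_id, p_value) tuples whose trait matches exactly,
    ignoring case."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wanted = trait.casefold()
    return [
        (row["SNPS"], row["PUBMEDID"], float(row["P-VALUE"]))
        for row in reader
        if row["DISEASE/TRAIT"].casefold() == wanted
    ]


for snp, pubmed_id, p_value in snps_for_trait(CATALOG_TSV, "example disease a"):
    print(snp, pubmed_id, p_value)
```

With a real download, the in-memory string would be replaced by the file contents; exact trait matching is a simplification, since published trait names vary.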

score 0 · Answer 2 · 2016-06-27

8.4 years ago

emmapead2 ▴ 60

I may be looking at this wrong, but if I had a sequence from a diseased patient and had identified the SNPs in it, I would use SnpEff to annotate the predicted effect of each SNP (synonymous/non-synonymous). This would give me the gene locus. I could then go to a SNP-specific database to look for any SNPs related to the disease (there are highly specific databases for these studies, e.g. sheephapmap). If there's nothing published, I'd then move on to proteomics, to see where in my protein the SNP has an effect and what the possible disease mechanism is. If it's synonymous or in a non-coding region, I'd then go to models on a population scale, i.e. is there a relationship between the presence of the SNP and the presence of the disease in the population?

If I'm right that what you're trying to do is allow someone to search for a disease and list the SNPs, this would be a rather backward approach to identifying new SNPs. That is, your search tool would only return SNPs we already know about, so if you used it as the criterion to search against your sequence data it would not return new SNPs. What most people do is identify the SNPs (.vcf file), annotate them, and then search for genes/diseases etc. This way you have a list of known SNPs and new SNPs from your sequence data, and you don't introduce bias when looking for new SNPs.

In short, I don't think your system would be helpful in identifying new SNPs.
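As a rough illustration of the "call variants first, then look them up" order described above, the sketch below splits VCF-style records into those that already carry an identifier and those that do not. It assumes that a missing ID (`.` in the VCF ID column) is a usable proxy for "not yet in a database", which is a simplification; the records and the function name `split_known_and_novel` are hypothetical.

```python
# Placeholder VCF-style text: CHROM, POS, ID, REF, ALT (tab-separated).
VCF_TEXT = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT
1\t1000\trs0000001\tA\tG
1\t2000\t.\tC\tT
2\t3000\trs0000002\tG\tA
"""


def split_known_and_novel(vcf_text):
    """Partition records by whether the ID column is filled in.

    Assumption: "." in the ID column means the variant has no database
    identifier, so it is treated as a candidate novel SNP.
    """
    known, novel = [], []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        chrom, pos, var_id, ref, alt = line.split("\t")[:5]
        record = {"chrom": chrom, "pos": int(pos), "id": var_id,
                  "ref": ref, "alt": alt}
        (novel if var_id == "." else known).append(record)
    return known, novel


known, novel = split_known_and_novel(VCF_TEXT)
print(len(known), "known;", len(novel), "novel")
```

In practice the annotation step (SnpEff or a similar tool) would run before this split, and the known set could then be checked against a disease-association resource.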

0

Hi,

Your first paragraph can be replaced by one tool, namely VEP from Ensembl, except for the disease presence in a population. I wonder where you can get such information from?

On the other hand, @nohaseddik: IMHO, your approach sounds more like the GWAS Catalog, ClinVar & COSMIC. Personally, I think your approach is valid as long as you bring something new beyond what those resources offer and have a plan for keeping it up to date.

Reply • 8.4 years ago by H.Hasani ▴ 990

0

VEP does the same as SnpEff, so it just depends on what other tools you are using (for compatibility).

Reply • 8.4 years ago by emmapead2 ▴ 60

1

In principle, yes! However, one can extend the information retrieved by VEP to include other databases (where the mutation was found), and can include the status of the mutation at the protein level too.

Reply • 8.4 years ago by H.Hasani ▴ 990

0

Oh cool, that's good to know. I may have to start using VEP!

Reply • 8.4 years ago by emmapead2 ▴ 60

score 0 · Answer 3 · 2016-06-28

I was (marginally) involved in a somewhat similar study. The question in that case was: imagine I get 20 positive findings from a GWAS; which ones should I prioritize for follow-up?

It turned out that the most helpful pieces of prior knowledge were (in order): 1) the SNP was previously identified by MORE than one GWA study for the same (or a related) phenotype, 2) the SNP is in a functional protein domain, 3) the SNP has been associated with the phenotype in a functional model.
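One possible way to turn that ordering into a ranking is a lexicographic sort, sketched below. This is only an interpretation of the three criteria listed above, not the method used in the paper; the `Hit` fields, the SNP names and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    snp: str
    n_prior_gwas: int               # earlier GWA studies reporting this SNP
    in_functional_domain: bool      # SNP lies in a functional protein domain
    functional_model_support: bool  # association seen in a functional model


def priority_key(hit):
    # Tuple order mirrors the order of the three criteria; True sorts above False.
    return (hit.n_prior_gwas > 1,
            hit.in_functional_domain,
            hit.functional_model_support)


def prioritize(hits):
    """Sort hits so that those meeting earlier criteria come first."""
    return sorted(hits, key=priority_key, reverse=True)


hits = [
    Hit("snp_a", 0, False, True),
    Hit("snp_b", 2, False, False),
    Hit("snp_c", 1, True, False),
    Hit("snp_d", 3, True, False),
]
ranked = [h.snp for h in prioritize(hits)]
print(ranked)  # ['snp_d', 'snp_b', 'snp_c', 'snp_a']
```

A strict lexicographic order lets the first criterion dominate completely; a weighted score would be an alternative if the criteria should trade off against each other.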

The paper is freely available here, and if I remember correctly we also published scripts for the information retrieval steps.