Noob needs help: gathering SNPs, phenotypes, p-values and pharm
9.1 years ago

Hello,

I'm new to bioinformatics but not to hacking code. I'm working on an application and need some guidance on getting the data. I've read a few other posts, but the solution still eludes me.

I want to generate the following data from GWAS:

Human

All chromosomes: chr 1-22, X, Y

Max p-value: 1.0

The size of the database doesn't matter. The application/database will be local as it may or may not have internet access to query.

I want an output in this form:

Phenotype, SNP, p-value, chr#, start location, annotation

Eventually I want to be able to link in a pharmacogenetic database. Here's one of the many variations I've tried:

Dataset

Filters

Chromosome : 1
Marker Start (bp): 1
Marker End (bp): 10000000
Greater or equal: -log p-value >= 1

Attributes

Chromosome
Marker Start
p-value
Phenotype Annotation Identifier
Annotation Name
-log p-value

Key Results

Can someone help?

Thanks

pvalues SNP Phenotypes • 2.0k views

It's difficult to understand exactly what you are trying to do. Are you querying a public database of GWAS results, and not getting any results back using the filter conditions you describe? If so, what GWAS database are you querying exactly?


Thanks for the reply.

http://mart.gwascentral.org/biomart/

I'm using the GWASmart db, G2P study. Correct, I am getting very few results, and I would have thought that with a p-value that large there would be a lot of data. I would also like to specify a minimum of 1500 samples but didn't see where to filter on that.

Is there a better database, or what am I doing wrong?

9.1 years ago
Ahill ★ 2.0k

I'm unable to get the Biomart query to return results. Not sure what is going on there, perhaps contact the site admins. Others may have better suggestions on the Biomart query.
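One thing that may be worth checking on the filter side: if the "-log p-value" field is a base-10 negative log (an assumption on my part, I haven't confirmed how the mart defines it), then a filter of -log >= 1 keeps only p-values of 0.1 or smaller, not everything up to 1.0. A minimal sketch of the conversion, with helper names that are purely illustrative:

```python
import math


def neg_log10(p):
    """Return -log10(p) for a p-value in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p)


def p_from_neg_log10(x):
    """Return the p-value corresponding to a -log10 threshold."""
    return 10 ** (-x)


def passes_filter(p, min_neg_log10):
    """True if p survives a '-log p-value >= min_neg_log10' filter."""
    return neg_log10(p) >= min_neg_log10
```

Under that assumption, a p-value of 0.5 fails a -log >= 1 filter, while a threshold of 0 lets through every p-value up to 1.0.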

You mentioned size of dataset is not a concern, so an alternative that will get you some data is to pull the top 35 results for a Coronary Artery Disease study (Case-control) that was based on > 1500 subjects:

You'll get a 35-row sheet including the fields SNP, p-value, chr#, and start location, all for the Coronary Artery Disease phenotype.
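Since the application needs to work without internet access, a sheet like this could be loaded into a local SQLite file and queried offline. The following is only a rough sketch: it assumes the sheet has been saved as CSV with the column headers `phenotype, snp, pvalue, chrom, start, annotation` (the real export will likely use different names that you would need to map), and the table name, function names and sample rows are all placeholders rather than real GWAS results.

```python
import csv
import sqlite3

# Placeholder rows for illustration only; these are NOT real GWAS results.
SAMPLE_CSV = """phenotype,snp,pvalue,chrom,start,annotation
Coronary Artery Disease,rsEXAMPLE1,1e-8,9,1000,placeholder
Coronary Artery Disease,rsEXAMPLE2,0.5,1,2000,placeholder
Coronary Artery Disease,rsEXAMPLE3,0.002,X,3000,placeholder
"""


def load_results(conn, csv_lines):
    """Load CSV lines (assumed header names) into a local 'gwas' table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS gwas (
               phenotype TEXT, snp TEXT, pvalue REAL,
               chrom TEXT, start INTEGER, annotation TEXT)"""
    )
    rows = [
        (r["phenotype"], r["snp"], float(r["pvalue"]),
         r["chrom"], int(r["start"]), r.get("annotation") or "")
        for r in csv.DictReader(csv_lines)
    ]
    conn.executemany("INSERT INTO gwas VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def query_results(conn, max_p=1.0, chrom=None):
    """Return rows with pvalue <= max_p, optionally for one chromosome."""
    sql = ("SELECT phenotype, snp, pvalue, chrom, start, annotation "
           "FROM gwas WHERE pvalue <= ?")
    params = [max_p]
    if chrom is not None:
        sql += " AND chrom = ?"
        params.append(str(chrom))  # stored as text so X and Y fit too
    sql += " ORDER BY pvalue"
    return conn.execute(sql, params).fetchall()
```

For example, `load_results(sqlite3.connect("gwas.db"), open("results.csv", newline=""))` would populate a local file (both file names are hypothetical), and `query_results(conn, max_p=0.01, chrom="X")` would then return matching rows in the Phenotype, SNP, p-value, chr#, start location, annotation order asked for above. A pharmacogenetic table could later be joined on the SNP column.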


Great, thank you very much.

I'm glad the problem was with the site and not my search. I'll send an email to the admin.
