Hi all!
I just found this entry, "Retrieving All Available Frequency Data For A Snp Using Ensembl Api Tools", which is very close to what I need. Like Krisr, I would like to retrieve all population frequency data available from 1000Genomes phase 1 for a SNP, if possible via SQL.
Ensembl's BioMart provides minor allele information for the ALL superpopulation only. Pierre Lindenbaum's solution almost gets me to the desired result, but when I run the SQL statement (on homo_sapiens_variation_69_37), I only get results from 1000Genomes:pilot_1, not from phase 1.
select distinct V.name, S.handle, A.frequency, M.name, F.allele_string
from ( allele as A, variation as V, subsnp_handle as S, variation_feature as F ) left join sample as M
on (M.sample_id = A.sample_id )
where
V.variation_id=A.variation_id and
S.subsnp_id =A.subsnp_id and
F.variation_id=V.variation_id and
V.name="rs3"
order by 2;
Any suggestions where I could find this data? Alternatively, is there a way to get the SQL statements out of BioPerl? Bert Overduin provided a nice Perl script, but I need SQL for my workflow.
Hi Peixe! Thank you very much for this very interesting web app! I just gave it a try; unfortunately there's only data from 1000Genomes Pilot 1, but not from 1000Genomes Phase 1. Otherwise a very cool and fast application!