Question

Help Interfacing With UCSC Annotations Using MySQL

12.3 years ago

afkoeppel • 0

Hello everyone,

I am doing some homozygosity mapping on the zebrafish genome. I have generated a list of SNP positions that scored as highly homozygous. Next, I want to filter that list down to just the positions that fall within coding regions and cause an amino acid change.

I understand that one way to do this is to use MySQL to interface with the UCSC genome informatics website, but I have very little experience with MySQL and wanted to get some advice.

I saw the posting here, "SNP Query from UCSC", which looks somewhat similar to what I want to do, except that I already have the SNPs and want the annotations at the SNP positions (and for the zebrafish genome rather than human).

Is there a way to feed UCSC (via MySQL) a VCF file or a list of SNP positions and have it return the genes/transcripts that overlap those positions?

Thanks so much

Alex

ucsc mysql snp • 3.7k views

ADD COMMENT • link updated 12.3 years ago by Andy Yates ▴ 120 • written 12.3 years ago by afkoeppel • 0

Sure, there is a way: it requires that you have (or acquire) the skills to write code that can (1) read and parse the input file, (2) construct an SQL query using variables obtained from that input, and (3) process the output. So my advice would be: if you are motivated to learn those skills, go for it. Otherwise, some of the answers below outline quicker, simpler approaches.
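As a rough illustration of those three steps (an editorial sketch, not code from anyone in this thread), the Python below parses VCF-style lines, builds a parameterized query, and checks whether a position lands in a coding exon. It assumes the UCSC `refGene` table with its usual genePred columns (`cdsStart`, `cdsEnd`, `exonStarts`, `exonEnds`), that UCSC stores coordinates as 0-based half-open intervals while VCF positions are 1-based, and that chromosome names need a `chr` prefix. The helper names are hypothetical, and the sketch only reports overlap with coding exons; it does not predict amino acid changes.

```python
def parse_vcf_positions(lines):
    """Step 1: yield (chrom, pos) from VCF-style lines; pos is 1-based."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        # Assumption: UCSC names chromosomes "chr1", "chr2", ...
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom
        yield chrom, pos


# Step 2: a parameterized query (DB-API "format" style placeholders).
# Assumption: refGene exists for the chosen zebrafish assembly.
QUERY = (
    "SELECT name, name2, strand, cdsStart, cdsEnd, exonStarts, exonEnds "
    "FROM refGene "
    "WHERE chrom = %s AND cdsStart < %s AND cdsEnd >= %s"
)


def build_query(chrom, pos):
    """Return (sql, params) selecting transcripts whose CDS span covers pos."""
    return QUERY, (chrom, pos, pos)


def _to_int_list(value):
    """exonStarts/exonEnds are comma-separated, often with a trailing comma."""
    if isinstance(value, bytes):  # some drivers return blob columns as bytes
        value = value.decode("ascii")
    return [int(x) for x in value.strip(",").split(",") if x]


def in_coding_exon(pos, cds_start, cds_end, exon_starts, exon_ends):
    """Step 3: True if 1-based pos lies in an exon and inside the CDS span."""
    zero_based = pos - 1  # convert VCF position to a 0-based coordinate
    if not (cds_start <= zero_based < cds_end):
        return False
    starts = _to_int_list(exon_starts)
    ends = _to_int_list(exon_ends)
    return any(s <= zero_based < e for s, e in zip(starts, ends))
```

To run this against real data, you would pass the result of `build_query` to `cursor.execute(sql, params)` on any DB-API cursor (for example from a MySQL client library such as PyMySQL, which is an assumption here) connected to UCSC's public MySQL server, which UCSC documents as host `genome-mysql.soe.ucsc.edu` with user `genome` and no password. The database name should match the assembly your SNPs were called against (for example `danRer7`, another assumption). Querying one position at a time is slow for long lists; for those, downloading the table once and filtering locally is gentler on the public server.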

ADD REPLY • link 12.3 years ago by Neilfws 49k

See my answer to the following question: "Is there such a thing as a UCSC API?"

ADD REPLY • link 12.3 years ago by lh3 33k

score 1 · Answer 1 · 2013-04-30

Hi there

Have you considered using Ensembl's Variant Effect Predictor (VEP)? You can find more information at http://www.ensembl.org/info/docs/variation/vep/index.html. For reference, you have three options: submit up to 750 SNPs through our website interface, download the script and run it locally, or use our REST service.

The script version is especially cool, as you can download caches from Ensembl and annotate SNPs without communicating with any Ensembl MySQL database. It is also possible to annotate against gene annotations other than the ones Ensembl provides. Should you need any more information, give me a buzz or contact our helpdesk; details are available at http://www.ensembl.org/info/about/contact/index.html.

Good luck and I hope this has helped

score 0 · Answer 2 · 2013-04-30

12.3 years ago

Sean Davis 27k

If you just want to see overlaps, look at bedtools, BEDOPS, or even Galaxy if you like a GUI. That said, I'd suggest looking at SnpEff or the Ensembl Variant Effect Predictor for this task.


score 0 · Answer 3 · 2013-04-30

12.3 years ago

Pavel Senin ★ 1.9k

I guess you can get a zebrafish annotation from the Ensembl FTP site (http://www.ensembl.org/info/data/ftp/index.html) and then do your analysis easily on the command line with SnpEff/SnpSift (http://snpeff.sourceforge.net/SnpSift.html).
