Hello everyone,
I am doing some homozygosity mapping on the zebrafish genome. I have generated a list of SNP positions that scored as highly homozygous. What I want to do next is filter that list down to just those positions that fall within coding regions and cause an amino acid change.
I understand that one way to do this is to use MySQL to query the UCSC genome informatics website, but I have very little experience with MySQL and wanted to get some advice.
I saw the posting here: "SNP Query from UCSC", which looks somewhat similar to what I want to do, except that I already have the SNPs and I want the annotations at the SNP positions (and for the zebrafish genome rather than human).
Is there a way to feed UCSC (via MySQL) a VCF file or a list of SNP positions and have it return the genes/transcripts that overlap those positions?
Thanks so much
Alex
Sure, there is a way: it requires that you have (or acquire) the skills to write code that can (1) read and parse the input file, (2) construct an SQL query using variables obtained from that input and (3) process the output. So my advice would be: if you are motivated to learn those skills, go for it. Otherwise, some of the answers below outline quicker, simpler approaches.
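To make the three steps concrete, here is a rough Python sketch, not a finished tool. It assumes a tab-separated VCF, the refGene table (transcript coordinates are 0-based starts, VCF positions are 1-based) and UCSC-style "chr" chromosome names. The function names are made up for illustration. Actually running the query would need a MySQL driver of your choice pointed at the public UCSC server (host genome-mysql.soe.ucsc.edu, user genome) and a zebrafish database such as danRer11; that part is left out here.

```python
def read_vcf_positions(lines):
    """Step 1: parse VCF text lines into (chrom, pos) tuples; pos stays 1-based."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom  # assumption: UCSC tables use "chr"-prefixed names
        yield chrom, pos


def build_query(chrom, pos, table="refGene"):
    """Step 2: build a parameterised query for transcripts whose CDS span covers pos.

    UCSC starts are 0-based and ends are exclusive, so a 1-based VCF position
    lies inside the span when cdsStart < pos and cdsEnd >= pos.
    """
    sql = (
        "SELECT name, name2, strand, txStart, txEnd, cdsStart, cdsEnd "
        f"FROM {table} "
        "WHERE chrom = %s AND cdsStart < %s AND cdsEnd >= %s"
    )
    return sql, (chrom, pos, pos)


def summarise(position, rows):
    """Step 3: reduce the returned rows to one line per SNP with the gene names."""
    chrom, pos = position
    genes = sorted({row[1] for row in rows})  # row[1] is the name2 (gene symbol) column
    return f"{chrom}:{pos}\t{','.join(genes) if genes else '.'}"
```

With a driver, each (sql, params) pair would go to cursor.execute(sql, params) and the fetched rows to summarise(). Note that this only tells you whether a SNP lies between the start and end of a coding region, introns included. Deciding whether it actually changes an amino acid needs the exon coordinates and the sequence, which is where the simpler approaches in the other answers come in.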
See my answer to the following question: "Is there such a thing as a UCSC API?"