Retrieving promoter regions from UCSC using MySQL
8.4 years ago
annerionta93 ▴ 10

Hello there,

I think this is a really easy question for anybody with experience using MySQL and the UCSC Genome Browser.

I’m trying to retrieve all “promoter” regions, where a promoter is defined as the X bp upstream of the TSS.

I know how to do this using the UCSC browser; however, the output is displayed on screen, and I would instead like to download a BED file. Is there a button I'm missing somewhere?

Also, I would rather do this programmatically using a MySQL query (I think this link shows how to do it using R).

Many Thanks!

genome ucsc • 2.6k views
8.4 years ago

Here's another strategy, since the OP asks for MySQL. Get promoters, defined as TSS +/- 1000 bp, from RefSeq for hg19. Forward-strand transcripts:

mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
    "select chrom, (txStart + 1) - 1000, (txStart + 1) + 1000, name, name2, strand 
     from hg19.refGene
     where strand = '+'" > proms.txt

And reverse transcripts:

mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
    "select chrom, txEnd-1000, txEnd+1000, name, name2, strand 
     from hg19.refGene
     where strand = '-'" >> proms.txt
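
For what it's worth, the two strand-specific queries can probably be folded into one statement with MySQL's IF() function. This is only a sketch: it keeps the same arithmetic as above (TSS taken as txStart + 1 on the forward strand and txEnd on the reverse strand), and the names tss_window_sql, winStart and winEnd are made up for illustration.

```shell
# Print one SQL statement that covers both strands.
# $1 = flank size in bp, $2 = table to query (e.g. hg19.refGene)
tss_window_sql() {
    cat <<EOF
SELECT chrom,
       IF(strand = '+', txStart + 1, txEnd) - $1 AS winStart,
       IF(strand = '+', txStart + 1, txEnd) + $1 AS winEnd,
       name, name2, strand
FROM $2
EOF
}

# Usage (needs network access to the public UCSC MySQL server):
#   mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A \
#       -e "$(tss_window_sql 1000 hg19.refGene)" > proms.txt
```

Generating the SQL in a small function makes it easy to change the flank size or the table without editing two nearly identical commands.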

This is exactly what I had in mind! Thanks dariober! I edited the answer slightly to get only the upstream bp (instead of a region flanking the TSS on both sides):

mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, txStart - 1000, txStart, name, name2, strand from hg19.refGene where strand = '+'" > proms_plus.bed

and

mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, txEnd, txEnd+1000, name, name2, strand from hg19.refGene where strand = '-'" > proms_neg.bed
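
An alternative that might be easier to adapt is to pull the raw coordinates once and compute the upstream window locally with awk. The sketch below assumes tab-separated input in the column order chrom, txStart, txEnd, name, name2, strand, with txStart 0-based and txEnd exclusive (as stored in UCSC tables), and it writes BED6 with a placeholder score of 0. The function name upstream_bed is hypothetical. It clamps starts at 0 but does not check chromosome ends, which would need a chromosome sizes file.

```shell
# Emit the N bp upstream of each TSS as BED6 (0-based, half-open).
# $1 = number of upstream bp; reads refGene-style rows on stdin.
upstream_bed() {
    awk -v n="$1" 'BEGIN { FS = OFS = "\t" }
    {
        if ($6 == "+") { s = $2 - n; e = $2 }      # forward: window ends at txStart
        else           { s = $3;     e = $3 + n }  # reverse: window starts at txEnd
        if (s < 0) s = 0                           # do not run off the start of the sequence
        if (e > s) print $1, s, e, $4, 0, $6       # skip empty windows
    }'
}

# Usage (needs network access to the public UCSC MySQL server):
#   mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A \
#       -e "select chrom, txStart, txEnd, name, name2, strand from hg19.refGene" \
#       | upstream_bed 1000 > proms.bed
```

Both strands end up in one file, and changing the window size is a matter of changing the single argument.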
8.4 years ago
Denise CS ★ 5.2k

Regions with promoter (or enhancer, TFBS) activity have been annotated in Ensembl through our Regulatory Build, based on the ENCODE, Roadmap and Blueprint biochemical data. It's available for both GRCh38 and GRCh37 (=hg19), and it can be accessed with MySQL, although we would recommend using our APIs (Perl or REST). This is the Regulation schema and the API tutorial.
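
As a rough illustration of the REST route, the sketch below only builds a request URL for regulatory features overlapping a region. It assumes the overlap/region endpoint with feature=regulatory and the separate GRCh37 REST server; the function name ensembl_regulatory_url and the example region are made up for illustration, so please check the current REST documentation before relying on it.

```shell
# Print an Ensembl REST URL for regulatory features overlapping a region.
# $1 = region as "chrom:start-end" (Ensembl-style name, no "chr" prefix)
# $2 = assembly, GRCh38 (default) or GRCh37
ensembl_regulatory_url() {
    case "${2:-GRCh38}" in
        GRCh37) host="https://grch37.rest.ensembl.org" ;;
        *)      host="https://rest.ensembl.org" ;;
    esac
    printf '%s/overlap/region/human/%s?feature=regulatory;content-type=application/json\n' "$host" "$1"
}

# Usage (needs network access; the region here is arbitrary):
#   curl -s "$(ensembl_regulatory_url 7:140424943-140624564 GRCh37)"
```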


Thanks Denise. I'm definitely spending some time learning the Ensembl API today.


Fab! If you get stuck, get in touch with our helpdesk. You may also sign up to our dev forum.

8.4 years ago
GenoMax 146k

If the output is displayed in your browser window, you should be able to right-click on the page and choose "Save Page As" (or the equivalent command in your browser). If you are using the UCSC Table Browser, entering a name in the output file field sends the output to a file instead of the screen.


Thanks! This was easy enough.
