Retrieving data of all exons from UCSC using MySQL
1
0
Entering edit mode
8.2 years ago
tiago211287 ★ 1.5k

I am using this command bellow to retrieve data from all exon start and all exon end plus 10 bp from each end. However, looking the output I saw several genes with only the first exon of that gene. What I am doing wrong?

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19  -N -e 'select chrom,exonStarts-10,exonEnds+10,name2,strand from refGene '
ucsc mysql • 2.1k views
ADD COMMENT
0
Entering edit mode

If I recall correctly, exonStarts and exonEnds are text columns with comma separated values, right? How is your query supposed to work on such data?

ADD REPLY
0
Entering edit mode

Dont know, I collected information from several posts to make this command. The first time I hear that, makes sense.

ADD REPLY
1
Entering edit mode

Did you happen to add the +10 part before testing the original query? You should try and run an unaltered query so it's easier to figure out when something stops working.

ADD REPLY
1
Entering edit mode

No, because I did not knew this field was comma separated. Indeed this solved it all. Thanks.

ADD REPLY
0
Entering edit mode

Be useful to leave a not about what solved it all for folks who happen on this thread via search in future.

ADD REPLY
1
Entering edit mode
8.2 years ago
tiago211287 ★ 1.5k

Answering my own question:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19  -N -e 'select chrom,exonStarts-10,exonEnds+10,name2,strand from refGene '

The exon start and end column has all exons in comma separated fields, so adding 10 to it will make only the first value to be showed up. So the better thing to do is use like this:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19  -N -e 'select chrom,exonStarts,exonEnds,name2,strand from refGene ' > hg19.genes

If one wants to get each exon per row, you need to use this awk command :

awk '{ n = split($2, a, ","); split($3, b, ","); for(i=1; i<n; ++i) print $1, a[i], b[i], $4, $5 }' h19.genes > h19.genes.bed
ADD COMMENT

Login before adding your answer.

Traffic: 2639 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6