UCSC Track exonStarts/exonEnds To BED
11.6 years ago
PoGibas 5.1k

Example from UCSC track:

    bin    name    chrom    strand    txStart    txEnd    cdsStart    cdsEnd    exonCount    exonStarts    exonEnds
    73    TCONS_l2_00002368    chr1    -    89294    237877    237877    237877    2    89294,236614,    90404,237877,
    585    TCONS_l2_00002369    chr1    -    89550    91105    91105    91105    2    89550,90286,    90050,91105,

How can I pair each exonStarts value with its corresponding exonEnds value to get a BED file containing only the exons?

    chr1    89294    90404
    chr1    236614    237877
    chr1    89550    90050
    chr1    90286    91105
11.6 years ago
Duarte Molha ▴ 240

Install bedtools...

Instead of trying to do that from the track table, download the track as a BED12 file and use the bed12ToBed6 command to get all exon coordinates:

bed12ToBed6 -i UCSC_track.bed > UCSC_discrete_features.bed

or

bedtools bed12tobed6 [OPTIONS] -i UCSC_track.bed > UCSC_discrete_features.bed

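If this is not possible, a relatively simple awk script should do the trick. The loop stops before n_exons because the trailing comma in each list leaves an empty last element after split: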

awk 'BEGIN{OFS="\t"}{n_exons = split($10,e_starts,",");split($11,e_ends,",");for(i=1;i<n_exons;i++){print $3,e_starts[i],e_ends[i]}}' t.track > exons.bed
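
For anyone who prefers Python, the same pairing of exonStarts and exonEnds can be sketched roughly as below. This is only an illustration, not a tested replacement for bedtools: the function name `track_to_bed` is made up here, and it assumes the whitespace-separated column layout from the question (chrom in column 3, exonStarts in column 10, exonEnds in column 11).

```python
def track_to_bed(lines):
    """Yield (chrom, start, end) for every exon in a UCSC table dump.

    Assumes the column order shown in the question; `track_to_bed`
    is a hypothetical helper name, not part of any library.
    """
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row (its exonStarts column is not numeric).
        if len(fields) < 11 or not fields[9][0].isdigit():
            continue
        chrom = fields[2]
        # UCSC lists end with a trailing comma, so drop the empty last item.
        starts = [s for s in fields[9].split(",") if s]
        ends = [e for e in fields[10].split(",") if e]
        for start, end in zip(starts, ends):
            yield chrom, int(start), int(end)


example = """\
bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds
73 TCONS_l2_00002368 chr1 - 89294 237877 237877 237877 2 89294,236614, 90404,237877,
585 TCONS_l2_00002369 chr1 - 89550 91105 91105 91105 2 89550,90286, 90050,91105,
"""

for chrom, start, end in track_to_bed(example.splitlines()):
    print(chrom, start, end, sep="\t")
```

To run it on a real file, pass an open file handle instead of `example.splitlines()`, for example `track_to_bed(open("t.track"))`.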

I was using bedtools for some time and didn't know it had such a cool feature.


I specifically created a biostars account so that I could upvote this reply. Thank you very much!

11.6 years ago
AndreiR ▴ 260

Hi, I think this may help you:
Exon coordinates of hg19 genome download
