Hi,
I downloaded a list of RefSeq genes from the UCSC Table Browser in BED format. According to the BED format description given by UCSC, thickStart and thickEnd mean:
thickStart - The starting position at which the feature is drawn thickly (for example, the start codon in gene displays).
thickEnd - The ending position at which the feature is drawn thickly (for example, the stop codon in gene displays).
This has gotten me a bit confused. To explain my confusion, look at the following sample BED lines from UCSC.
chrI 7741935 8394405 NR_070240 0 + 8394405 8394405 0 8 18,12,13,9,11,11,8,18, 0,209004,270977,272247,461655,519425,544710,652452,
chrI 8378298 8390022 NM_001129046 0 - 8378298 8390022 0 8 123,103,110,116,65,69,124,113, 0,832,1401,2025,9723,9836,10481,11611,
So, what do columns 2 (start) and 3 (end) mean? And how are they different from columns 7 (thickStart) and 8 (thickEnd)? They seem to be different in most cases! I thought columns 2 and 3 meant the starting and ending positions of the genes, but the definition of thickStart and thickEnd has gotten me confused.
Here is the link to the BED file description given by UCSC.
It is surprising that none of the coordinates from my example are present in the output. The genome I used is ce10. Perhaps that's the reason?
Oops, updated for ce10...
I didn't know about \G, thanks. You still have '-D hg19'. Also good to note that when cdsStart == cdsEnd, it is a non-coding gene.
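To make the cdsStart == cdsEnd point concrete, here is a minimal Python sketch run on the two sample lines from the question. It assumes whitespace-separated BED12 lines in which columns 7 and 8 (thickStart, thickEnd) carry the cdsStart and cdsEnd values; the names `BedRecord`, `parse_bed12` and `is_noncoding` are made up for illustration and are not part of any UCSC tool.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    """Subset of BED12 fields needed to compare transcript and CDS bounds."""
    chrom: str
    chrom_start: int   # column 2: start of the whole feature (transcript)
    chrom_end: int     # column 3: end of the whole feature (transcript)
    name: str
    strand: str
    thick_start: int   # column 7: start of the thickly drawn (coding) part
    thick_end: int     # column 8: end of the thickly drawn (coding) part


def parse_bed12(line: str) -> BedRecord:
    """Parse one BED line; split() accepts tabs or spaces as separators."""
    f = line.split()
    return BedRecord(f[0], int(f[1]), int(f[2]), f[3], f[5], int(f[6]), int(f[7]))


def is_noncoding(rec: BedRecord) -> bool:
    """Treat a record as non-coding when thickStart equals thickEnd."""
    return rec.thick_start == rec.thick_end


# The two sample lines from the question (ce10).
SAMPLE_LINES = [
    "chrI 7741935 8394405 NR_070240 0 + 8394405 8394405 0 8 "
    "18,12,13,9,11,11,8,18, 0,209004,270977,272247,461655,519425,544710,652452,",
    "chrI 8378298 8390022 NM_001129046 0 - 8378298 8390022 0 8 "
    "123,103,110,116,65,69,124,113, 0,832,1401,2025,9723,9836,10481,11611,",
]

for line in SAMPLE_LINES:
    rec = parse_bed12(line)
    label = "non-coding" if is_noncoding(rec) else "coding"
    print(rec.name, rec.chrom_start, rec.chrom_end,
          rec.thick_start, rec.thick_end, label)
```

With these two lines, NR_070240 comes out as non-coding (thickStart and thickEnd are both 8394405), while NM_001129046 comes out as coding, with a thick region that happens to span the whole feature.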