Downloading hg38 centromere and telomere positions from UCSC table browser
4.6 years ago
graeme.thorn ▴ 100

I'm converting some R code (from here: https://github.com/cancer-genomics/delfi_scripts) from the hg19 to the hg38 assembly. It relies on automatically downloading telomeric and centromeric regions from the UCSC table browser:

library(rtracklayer)  # provides browserSession(), ucscTableQuery(), getTable()
genome <- "hg19"
mySession <- browserSession()
genome(mySession) <- genome
gaps <- getTable(ucscTableQuery(mySession, track="gap"))

in order that the resulting fragment calculations don't cover (some of) the less mappable regions of the genome. However, the corresponding code for hg38:

genome <- "hg38"
mySession <- browserSession()
genome(mySession) <- genome
gaps <- getTable(ucscTableQuery(mySession, track="gap"))

does not return the centromere positions from the UCSC table (it has all the telomeric ranges). I have tested the online table browser and that does not return hg38 centromere positions either. Is there another source for these?
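
A quick way to see which gap categories a returned table actually contains is to tabulate its type column. This is only a minimal sketch, assuming `gaps` is a data frame with a `type` column as in the UCSC gap table; the rows below are invented placeholders, not real coordinates.

```r
# Toy stand-in for the table returned by getTable(); all values are invented.
gaps <- data.frame(
  chrom      = c("chr1", "chr1", "chr2"),
  chromStart = c(0, 5000, 0),
  chromEnd   = c(1000, 6000, 1000),
  type       = c("telomere", "contig", "telomere"),
  stringsAsFactors = FALSE
)

# Count rows per gap type and check whether any centromere rows are present.
type_counts <- table(gaps$type)
print(type_counts)
has_centromere <- "centromere" %in% names(type_counts)
```

With the real hg38 table, `has_centromere` being FALSE would reproduce the problem described above.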

ucsc R genome • 9.2k views

While UCSC support stops by here once in a while you should probably report this directly to them (genome at soe.ucsc.edu) and then provide an update here.


I will provide feedback to them about this. However, I'm just looking for a table of positions in hg38 that I can bolt on to the existing removed regions to ease the workflow.

EDIT: it does look like this is a frequent question to them, see for instance here: https://groups.google.com/a/soe.ucsc.edu/forum/#!msg/genome/SaR2y4UNrWg/XsGdMI3AazgJ

The answer to this query doesn't really help, though.


Hi, is there a solution for this yet?

I'm looking for the centromeric regions in release GRCh38.

I would need something like: chr start end

I can't work out where to find it. I have looked everywhere and there is still no direct information.

Thanks


See my answer.

4.2 years ago
ATpoint 85k

For centromeres, the table browser is the answer.

Select BED as the output format, click "get output", choose "Whole Gene", then click "get BED". Done.


Hi, thank you for the answer.

Eventually I got what I needed with a few adjustments.

Here is my solution for obtaining centromeric coordinates for hg38:

  • Go to the Table Browser: http://genome.ucsc.edu/cgi-bin/hgTables
  • Choose the Mapping and Sequencing group
  • Select the "Chromosome Band (Ideogram)" track
  • Select filter, and enter "cen" in the gieStain field
  • Press "submit" and then "get output"

Each chromosome will have two entries which overlap.

They can be simply merged into a single entry.

I hope this could be helpful for others.
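
The merge step can be sketched in base R as follows. This assumes the Table Browser output has been loaded into a data frame with `chrom`, `chromStart` and `chromEnd` columns; the `acen` rows below are invented toy values, not real hg38 coordinates.

```r
# Toy centromeric rows, two per chromosome; coordinates are invented.
acen <- data.frame(
  chrom      = c("chr1", "chr1", "chr2", "chr2"),
  chromStart = c(100, 150, 300, 340),
  chromEnd   = c(150, 200, 340, 400),
  stringsAsFactors = FALSE
)

# Collapse to one row per chromosome: smallest start and largest end.
starts <- aggregate(chromStart ~ chrom, data = acen, FUN = min)
ends   <- aggregate(chromEnd ~ chrom, data = acen, FUN = max)
centromeres <- merge(starts, ends, by = "chrom")
print(centromeres)
```

The result has the "chr start end" shape asked for earlier in the thread, one row per chromosome.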


As of 2021 the filter you want for this approach is "acen" rather than "cen", as noted by Simo above.
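
The same filter can be applied in R once the ideogram table has been saved as a tab-separated file. A rough sketch, assuming a headerless file with the five columns chrom, chromStart, chromEnd, name and gieStain; `read_acen` is a hypothetical helper name and the toy input is invented.

```r
# Hypothetical helper: read a headerless, tab-separated band table
# and keep only the rows whose gieStain is "acen".
read_acen <- function(con) {
  bands <- read.table(
    con, sep = "\t", header = FALSE, quote = "",
    col.names = c("chrom", "chromStart", "chromEnd", "name", "gieStain"),
    stringsAsFactors = FALSE
  )
  bands[bands$gieStain == "acen", ]
}

# Toy input with invented coordinates, standing in for a downloaded file.
toy <- paste(
  "chr1\t0\t100\tp11.2\tgneg",
  "chr1\t100\t150\tp11.1\tacen",
  "chr1\t150\t200\tq11\tacen",
  "chr1\t200\t300\tq12\tgvar",
  sep = "\n"
)
con <- textConnection(toy)
acen <- read_acen(con)
close(con)
print(acen)
```

In practice a path to the saved file would be passed in place of the text connection.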
