Hello,
I would like to convert human Ensembl transcript coordinates to GRCh38 genomic coordinates.
I tried the R package "ensembldb 2.13.1", but it uses an old version of the database (EnsDb.Hsapiens.v86), which is not suitable for my data from Ensembl release 100.
Could you recommend a tool for this task?
Sorry if my questions seem strange, I'm new to bioinformatics.
I need to convert an exact position in a transcript (e.g. position 133 in ENST00000000233.10) to a genomic coordinate.
I do not need the genomic coordinates of the entire transcript.
You may need to do that yourself, and it may require writing some custom code. The methods mentioned here will give you the genomic coordinates of the entire transcript.
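It depends on cgat as well as pandas and numpy. You use it like so:

    import pandas
    from cgat import GTF
    # TranscriptCoordInterconverter is the class from my code; import it from wherever you saved it.

    # Expects a column "ENST" (transcript ID) and a column "position" (transcript coordinate).
    coords = pandas.read_csv("mycoords.tsv", sep="\t").set_index("ENST")

    for transcript in GTF.transcript_iterator(GTF.iterator(open("my_gtf.gtf"))):
        transcript_id = transcript[0].transcript_id
        if transcript_id not in coords.index:
            continue  # nothing to convert for this transcript
        converter = TranscriptCoordInterconverter(transcript)
        # The double brackets keep the result a Series even for a single row.
        genome_coords = converter.transcript2genome(coords.loc[[transcript_id], "position"])
        for pos in genome_coords:
            print("%s\t%s" % (transcript_id, pos))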
It's a bit rusty, written years ago in Python 2.7, but you get the idea. One of these days I'll get round to packaging it up as a proper utility.
Presented "as is". No guarantees implied. Caveat emptor.
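For readers who just want the idea behind such a converter, here is a minimal, dependency-free sketch. It is not the code from the answer above. It assumes you already have the exon intervals of the transcript (1-based, inclusive, as in a GTF file) and its strand, and that the transcript position is 1-based and counted from the 5' end of the spliced transcript, UTRs included. The function name and the toy coordinates are made up for illustration.

```python
def transcript_to_genome(exons, strand, pos):
    """Map a 1-based transcript position to a 1-based genomic position.

    exons:  list of (start, end) genomic intervals, 1-based inclusive,
            non-overlapping, in any order
    strand: +1 or -1
    pos:    1-based position from the 5' end of the spliced transcript
    """
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    # Walk the exons in transcription order: ascending on +, descending on -.
    ordered = sorted(exons, reverse=(strand < 0))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            # The position falls inside this exon.
            if strand > 0:
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position %d is beyond the end of the transcript" % pos)


# Toy transcript with exons of 100 bp and 50 bp (made-up coordinates).
exons = [(1000, 1099), (2000, 2049)]
print(transcript_to_genome(exons, +1, 133))  # 33rd base of the second exon
print(transcript_to_genome(exons, -1, 1))    # last genomic base on the minus strand
```

The exon intervals can be taken from the GTF file of the matching Ensembl release, which keeps the result consistent with the annotation the positions were computed on.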
Ensembl transcripts should be using the latest genome build. Can you provide examples of what you have that you want to convert?
Now I have a csv-file like this:
and I want to convert these positions to genomic coordinates.
Use BioMart https://www.ensembl.org/biomart/martview
Could you please briefly explain how to use BioMart for the conversion? I have previously used it only to export data.
If you have used BioMart before, just cut the first column of IDs from your file and use that to restrict your search.
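If cutting the ID column by hand is awkward, a few lines of Python can do it. This is only a sketch: it assumes the file has a header row and that the transcript IDs sit in the first column, and it strips the version suffix (".10") on the assumption that you will filter on unversioned stable IDs. The function name and the two-row example are hypothetical.

```python
import csv
import io


def transcript_ids(handle):
    """Return unique, unversioned IDs from the first column of a CSV file."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row (assumed to be present)
    seen = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # ENST00000000233.10 -> ENST00000000233
        stable_id = row[0].strip().split(".")[0]
        if stable_id and stable_id not in seen:
            seen.append(stable_id)
    return seen


# Made-up example standing in for the real file.
example = io.StringIO(
    "transcript,position\n"
    "ENST00000000233.10,133\n"
    "ENST00000000233.10,250\n"
)
print("\n".join(transcript_ids(example)))
```

For a real file, pass `open("myfile.csv", newline="")` instead of the in-memory example and paste the printed IDs into the BioMart ID filter.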
Thank you for your help!
BioMart would be simpler. To do it programmatically, use the REST API: https://rest.ensembl.org/lookup/id/ENST00000000233?content-type=application/json;expand=1
I have Python code to do this if it is of any help.
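For a single position rather than the whole transcript, the Ensembl REST server also has a map/cdna endpoint that, as far as I know, converts cDNA (spliced transcript) coordinates directly to genomic coordinates. The sketch below uses only the standard library. The helper names are made up, the response fields reflect my understanding of the reply format, and the version suffix is stripped because the URL above uses the bare stable ID. Note that the live server reflects the current Ensembl release, so results may differ from release 100 if the transcript model has changed.

```python
import json
from urllib.request import urlopen

SERVER = "https://rest.ensembl.org"


def cdna_map_url(transcript_id, start, end=None):
    """Build the map/cdna URL for one transcript position or a range."""
    stable_id = transcript_id.split(".")[0]  # drop the ".10"-style version
    end = start if end is None else end
    return "%s/map/cdna/%s/%d..%d?content-type=application/json" % (
        SERVER, stable_id, start, end)


def parse_mappings(payload):
    """Reduce the JSON reply to (chromosome, start, end, strand) tuples."""
    return [
        (m["seq_region_name"], m["start"], m["end"], m["strand"])
        for m in payload.get("mappings", [])
    ]


def fetch_genome_position(transcript_id, position):
    """Query the live server; this needs network access."""
    with urlopen(cdna_map_url(transcript_id, position)) as response:
        return parse_mappings(json.load(response))


# Only builds the URL; call fetch_genome_position() to actually query.
print(cdna_map_url("ENST00000000233.10", 133))
```

Calling `fetch_genome_position("ENST00000000233.10", 133)` would then return a list of tuples, one per mapped genomic segment. For many positions, keep the request rate modest or use the GTF-based approach instead.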
Put it in a GitHub gist and link it as an answer.
I've written my own code already, but it would be great to see the code of a more experienced user. Please share it.