Is there a quick way to get the Ensembl gene_id and gene_symbol for a list of (transcript) coordinates? The post "A: Identify gene symbols given a list of chromosome positions" only covers UCSC.
I'd be surprised if this couldn't be done with BioMart.
In fact, here is a very simple example using GRCh38.
BTW, if for some reason you really want a Python-based solution, then download a GTF file and:
pip install deeptools
then in Python:
from deeptoolsintervals import GTF
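# transcriptID is the feature type (GTF column 3); transcript_id_designator is the attribute key holding the ID
anno = GTF("foo.gtf", transcriptID="gene", transcript_id_designator="gene_id")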
anno.findOverlaps("chr1", 1, 1000)
That will get you the gene_id field and coordinate information. The Python wrapper doesn't allow access to the symbol, so you'd need to download the ID-to-symbol mapping from BioMart.
If you don't want to perform a bunch of remote queries then something along those lines would work. I never really intended for others to use that Python module, but if you ever want to, it's documented here.
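For the symbol lookup, something like the following might do as a rough sketch. It assumes you exported a tab-separated file from BioMart with the columns "Gene stable ID" and "Gene name"; those header names, and the helper names `load_symbol_map` and `strip_version`, are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io


def load_symbol_map(handle, id_column="Gene stable ID", symbol_column="Gene name"):
    """Read a tab-separated BioMart export into a {gene_id: symbol} dict.

    The column names are assumptions; change them to match your file's header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: row[symbol_column] for row in reader}


def strip_version(gene_id):
    """Drop a trailing version suffix, e.g. ENSG00000157764.14 -> ENSG00000157764."""
    return gene_id.split(".", 1)[0]


# Tiny in-memory stand-in for a downloaded mapping file.
example = io.StringIO("Gene stable ID\tGene name\nENSG00000157764\tBRAF\n")
symbols = load_symbol_map(example)
print(symbols.get(strip_version("ENSG00000157764.14"), "NA"))
```

With a real file you would pass `open("mart_export.txt", newline="")` (a hypothetical file name) instead of the `StringIO` object, then look up each gene_id returned by the overlap query.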
Hello,
you could use Ensembl's REST API for this, e.g. the Overlap endpoint.
fin swimmer
Any snippet to query the REST API using Python?
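A minimal sketch of such a query, using only the standard library, could look like the one below. It assumes the public server at https://rest.ensembl.org, the `overlap/region` endpoint with `feature=gene`, and that each returned gene record carries `id` and `external_name` fields. The function names are made up for illustration, and the example region is arbitrary. Note that Ensembl chromosome names have no "chr" prefix.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # public server; assumed reachable


def build_overlap_url(chrom, start, end, species="human"):
    """Build an Overlap endpoint URL that asks only for gene features."""
    region = f"{chrom}:{start}-{end}"
    query = urllib.parse.urlencode({"feature": "gene"})
    return f"{ENSEMBL_REST}/overlap/region/{species}/{region}?{query}"


def extract_genes(features):
    """Reduce the decoded JSON list to (gene_id, gene_symbol) pairs.

    Genes without a symbol get None in the second position.
    """
    return [(f.get("id"), f.get("external_name")) for f in features]


def genes_for_region(chrom, start, end, species="human"):
    """Query the server for one region and return its (gene_id, symbol) pairs."""
    request = urllib.request.Request(
        build_overlap_url(chrom, start, end, species),
        headers={"Content-Type": "application/json"},  # request JSON output
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_genes(json.load(response))


if __name__ == "__main__":
    # Arbitrary example coordinates; replace with your own list.
    regions = [("7", 140424943, 140624564)]
    for chrom, start, end in regions:
        print(chrom, start, end, genes_for_region(chrom, start, end))
```

This sends one request per region, so for a long list you would want to pause between calls to stay within the server's rate limits, or fall back to a local GTF as suggested above.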