Converting Ensembl ID to NCBI ID
4.8 years ago
botloggy ▴ 10

I am trying to convert a list of Ensembl gene IDs from different species to NCBI Gene IDs. I found that, given a list of Ensembl gene IDs from a single species, the NCBI Gene IDs can be obtained through biomaRt ("Converting gene names").

However, I am not sure how to obtain the NCBI Gene IDs for a large list of Ensembl gene IDs that belong to different species.

For example:

ENSAPLG00000002137
ENSATEG00000009947
ENSACAG00000001379

Can anyone suggest an approach to automatically retrieve the NCBI Gene IDs?
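
Since per-species tools such as biomaRt need the IDs split by species first, here is a minimal sketch of that grouping step. It assumes the usual stable-ID layout ("ENS", optional species letters, "G" for gene, then 11 digits); the function name and the "UNRECOGNIZED" bucket are only illustrative, and versioned IDs (with a ".N" suffix) would need the suffix stripped beforehand.

```python
import re
from collections import defaultdict

# Assumed layout of an Ensembl gene ID:
# "ENS" + optional species letters + "G" + 11 digits.
GENE_ID = re.compile(r"^ENS([A-Z]*)G(\d{11})$")


def group_by_species_prefix(ids):
    """Group Ensembl gene IDs by their species letters (hypothetical helper)."""
    groups = defaultdict(list)
    for raw_id in ids:
        gene_id = raw_id.strip()
        match = GENE_ID.match(gene_id)
        # IDs that do not fit the assumed pattern are kept in a separate bucket.
        key = match.group(1) if match else "UNRECOGNIZED"
        groups[key].append(gene_id)
    return dict(groups)


ids = ["ENSAPLG00000002137", "ENSATEG00000009947", "ENSACAG00000001379"]
groups = group_by_species_prefix(ids)
for prefix, members in sorted(groups.items()):
    print(prefix, members)
```

Each group can then be sent to whichever per-species lookup is being used, and anything in the "UNRECOGNIZED" bucket can be inspected by hand.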

gene R sequence genome • 11k views

You could try the db2db resource for ID conversion.


This tool is good. But when I input my Ensembl gene IDs, only a few of the Gene IDs are retrieved. Some of the Ensembl IDs return no output at all.

Could you please tell me if there is any other powerful alternative tool or script through which I can extract the Gene IDs?


Hello Abhijeet Patil!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/11714/converting-ensembl-id-to-ncbi-id

This is typically not recommended as it runs the risk of annoying people in both communities.


Yes, I usually don't do that. My bad, I should never have posted on that site, as I knew I wouldn't get a response there. The Bioinformatics Stack Exchange community is relatively new; I posted the question more than 24 hours ago and it has only 4 views so far. That is why I posted the question here.


It's not just that, but your title makes your question seem like a pretty standard question - something a lot of people face and would need to learn on their own. Given that you're facing specific edge cases where the ID translation runs into problems, mentioning that might have gotten more interaction from people.


Thanks for the suggestion.

2.6 years ago

gget info should return the UniProt and NCBI gene ID if available.

pip install gget, then simply:

# Command-line
gget info ENSAPLG00000002137 ENSATEG00000009947 ENSACAG00000001379
# Python
import gget
gget.info(["ENSAPLG00000002137", "ENSATEG00000009947", "ENSACAG00000001379"])
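
If installing a package is not an option, a similar lookup could probably be sketched against the Ensembl REST cross-reference endpoint directly. This is an assumption-laden sketch, not a tested recipe: it assumes the `xrefs/id` endpoint, that NCBI Gene records are listed under the external database name "EntrezGene", and that each record carries `dbname` and `primary_id` fields. The function names are made up for illustration, and the fetch function is passed in so the parsing can be checked without network access.

```python
import json
import urllib.request

REST = "https://rest.ensembl.org"  # public Ensembl REST server


def fetch_json(url):
    """Download and decode a JSON response (needs network access)."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def ncbi_gene_ids(ensembl_id, fetch=fetch_json):
    """Return the NCBI Gene IDs cross-referenced for one Ensembl ID.

    Assumes each returned record is a dict with "dbname" and "primary_id",
    and that NCBI Gene entries are labelled "EntrezGene".
    """
    url = f"{REST}/xrefs/id/{ensembl_id}?external_db=EntrezGene"
    records = fetch(url)
    # Deduplicate and sort so the result is stable; empty list means no xref found.
    return sorted({r["primary_id"] for r in records if r.get("dbname") == "EntrezGene"})
```

Calling `ncbi_gene_ids("ENSAPLG00000002137")` would query the live service; an empty list would mean no cross-reference was returned for that gene, which is worth recording rather than silently dropping.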
4.8 years ago
Hugo ▴ 380

I guess you can do that conversion with the UniProt "Retrieve / ID Mapping" utility available at https://www.uniprot.org/uploadlists/

I have been using it for a while to convert between different database identifiers with success.

Regards,

Hugo.


It's not working. Moreover, if I select Ensembl Genomes as the input, there is no option to output the Gene IDs.


Right, in most cases you have to do a two-step conversion: first from the source (Ensembl) to UniProtKB, and then from UniProtKB to the other format.
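
As a rough sketch of that two-step idea, assuming the two mapping tables have already been exported from the ID mapping tool and loaded as dictionaries: the function name is made up for illustration and the accessions in the example tables are placeholders, not real identifiers.

```python
def chain_mappings(ensembl_ids, ensembl_to_uniprot, uniprot_to_ncbi):
    """Chain Ensembl -> UniProtKB -> NCBI Gene ID lookups (hypothetical helper).

    Returns (mapped, unmapped): a dict of Ensembl ID -> sorted NCBI Gene IDs,
    and the list of Ensembl IDs that got lost at either step.
    """
    mapped, unmapped = {}, []
    for gene_id in ensembl_ids:
        ncbi_ids = set()
        # Step 1: Ensembl -> UniProtKB accessions (may be several, or none).
        for accession in ensembl_to_uniprot.get(gene_id, []):
            # Step 2: UniProtKB accession -> NCBI Gene IDs.
            ncbi_ids.update(uniprot_to_ncbi.get(accession, []))
        if ncbi_ids:
            mapped[gene_id] = sorted(ncbi_ids)
        else:
            unmapped.append(gene_id)
    return mapped, unmapped


# Placeholder tables: the accessions below are NOT real identifiers.
step1 = {"ENSAPLG00000002137": ["UNIPROT_PLACEHOLDER_1"]}
step2 = {"UNIPROT_PLACEHOLDER_1": ["NCBI_PLACEHOLDER_1"]}

ids = ["ENSAPLG00000002137", "ENSATEG00000009947", "ENSACAG00000001379"]
mapped, unmapped = chain_mappings(ids, step1, step2)
print(mapped)
print(unmapped)
```

Keeping the unmapped list explicit makes it easy to see how many genes drop out at the UniProtKB step and to retry those with another resource.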


Yes, I thought the same and tried that, but from the list of genes I had, I could only find a few UniProtKB IDs. Thanks
