I have several Ensembl gene IDs for which I am not able to find the corresponding NCBI Gene IDs. Is there any way to find the NCBI Gene IDs? I want to extract the nucleotide and amino acid sequences for those IDs.
For example, when I searched for the following IDs on the Ensembl website:
ENSCCRG00010004296
ENSCSEG00000016380
ENSCCRG00010032853
ENSEBUG00000005910
ENSGAFG00000018062
For all of the above Ensembl gene IDs, I got the same output, shown below:
Source:NCBI gene;Acc:334648
Similarly, I have some other IDs. When I enter the following IDs on the Ensembl website:
ENSETEG00000011291
ENSEASG00005014496
ENSEEUG00000002917
For all of the above Ensembl gene IDs, I got the same output, shown below:
HGNC:18416
I am not sure how to extract the nucleotide and protein sequences for these Ensembl IDs.
Could you please tell me if there is any way I can find the sequence information for these? I tried BioMart, but it did not work.
Thinking about this some more, there is no easy way to do this using BioMart, since BioMart allows you to select only one species at a time and these identifiers are all from different species. You could try using the Ensembl REST API:
https://rest.ensembl.org/sequence/id/ENSEASG00005014496%20?content-type=text/cds
https://rest.ensembl.org/sequence/id/ENSEBUG00000005910?content-type=text/cds
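For a longer list of IDs, the same request can be scripted. The following is only a minimal sketch, not a tested pipeline: the helper names `sequence_url` and `fetch_fasta` are made up for illustration, and it assumes the `sequence/id` endpoint accepts the `type` (for example `cds`, `cdna`, `protein`), `multiple_sequences` and `content-type` query parameters as described in the Ensembl REST documentation. A gene can have several transcripts, which is why `multiple_sequences=1` is assumed to be needed when asking for CDS or protein by gene ID.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

ENSEMBL_REST = "https://rest.ensembl.org"


def sequence_url(stable_id, seq_type="cds"):
    """Build a sequence/id request URL (hypothetical helper).

    Whitespace around the ID is stripped so that a stray space does not
    end up in the URL as %20.
    """
    params = {
        "type": seq_type,                # assumed values: genomic, cds, cdna, protein
        "multiple_sequences": 1,         # a gene may map to several transcripts
        "content-type": "text/x-fasta",  # ask for FASTA output
    }
    return f"{ENSEMBL_REST}/sequence/id/{quote(stable_id.strip())}?{urlencode(params)}"


def fetch_fasta(stable_id, seq_type="cds"):
    """Download the sequence(s) for one stable ID as FASTA text (needs network)."""
    with urlopen(sequence_url(stable_id, seq_type), timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_fasta("ENSEBUG00000005910", "protein")` in a loop over the IDs should then return one FASTA record per translation, regardless of which species each ID belongs to.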
I think you are best off getting the sequence from NCBI's HomoloGene page after you decide which species you are interested in from Ensembl. Here is the complete list of Ensembl species prefixes.
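To see which species are mixed together in a list of IDs, the prefix can be pulled out of each ID. This is a rough sketch that assumes the usual Ensembl stable ID layout (`ENS`, an optional species prefix, a feature-type letter such as `G`, `T`, `P` or `E`, then 11 digits); the function name `group_by_prefix` is hypothetical, and the prefixes still have to be looked up in the species prefix list by hand.

```python
import re
from collections import defaultdict

# ENS + optional species prefix + feature type letter + 11 digits (assumed layout)
STABLE_ID = re.compile(r"^ENS([A-Z]*)([GTPE])(\d{11})$")


def group_by_prefix(ids):
    """Group stable IDs by their species prefix, e.g. 'ENSCCR' (hypothetical helper)."""
    groups = defaultdict(list)
    for raw in ids:
        stable_id = raw.strip()
        match = STABLE_ID.match(stable_id)
        # IDs that do not fit the pattern are kept aside rather than dropped.
        key = "ENS" + match.group(1) if match else "unrecognized"
        groups[key].append(stable_id)
    return dict(groups)


ids = [
    "ENSCCRG00010004296",
    "ENSCSEG00000016380",
    "ENSCCRG00010032853",
    "ENSEBUG00000005910",
    "ENSGAFG00000018062",
]
for prefix, members in group_by_prefix(ids).items():
    print(prefix, members)
```

Human IDs such as `ENSG...` have no species prefix, so they end up under the bare key `ENS`.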
Thank you for the suggestions on NCBI HomoloGene and the Ensembl species prefixes website. I found some on this website, OMAbrowser.org (for example, ENSGMOG00000017853), but I am not sure whether that information is correct.
I tried using the Ensembl API for the ID ENSAPOG00000007174 here. An NCBI Gene ID, 110954326, was found for the Ensembl ID here, and the FASTA sequence from NCBI is here. However, the sequences obtained from NCBI and the Ensembl API do not match. Could you tell me if I am doing something wrong? @genomax. Thank you.

They are the same sequence. See this blast2sequence result. The Ensembl sequence may include the 3'-UTR.
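A quick way to check for this kind of difference is to test whether the shorter sequence sits exactly inside the longer one and to look at what is left over on either side. The sketch below is only an illustration with made-up toy sequences and hypothetical helper names; it finds exact containment only, so for sequences that differ by even one base an alignment such as the blast2sequence comparison above is still needed.

```python
def read_fasta_sequence(fasta_text):
    """Return the sequence of a single-record FASTA string, without the header."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">")).upper()


def find_contained(shorter, longer):
    """If `shorter` occurs exactly in `longer`, return (start, leading, trailing).

    `leading` and `trailing` are the extra bases before and after the match,
    which is where a 5'-UTR or 3'-UTR would show up. Returns None otherwise.
    """
    start = longer.find(shorter)
    if start == -1:
        return None
    return start, longer[:start], longer[start + len(shorter):]


# Toy records standing in for the NCBI and Ensembl downloads (not real data).
ncbi_seq = read_fasta_sequence(">ncbi_toy\nATGGCC\nTAA\n")
ensembl_seq = read_fasta_sequence(">ensembl_toy\nATGGCCTAAGGGTTT\n")
print(find_contained(ncbi_seq, ensembl_seq))
```

An empty `leading` string together with a non-empty `trailing` string would be consistent with extra 3' sequence in the longer record.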