Retrieve nucleotide ID from protein ID
2.5 years ago
kmears • 0

Hello,

I have a list of protein accessions for which I need to find the corresponding nucleotide records. For example, given NP_056922.1, I want to find NC_001498. I figure there should be an easy way to do this, but I'm not familiar with the ins and outs of Entrez/Biopython.

Thanks!

refseq entrez biopython • 862 views
2.5 years ago
vkkodali_ncbi ★ 3.8k

You can use Entrez Direct (EDirect) for this as follows:

$ epost -db protein -id NP_056922.1 | elink -target nuccore | efetch -format acc
NC_001498.1

Alternatively, you can use Batch Entrez to upload your list of protein accessions, retrieve the proteins, and then use the "Find related data" widget on the right-hand side to get a list of all related Nucleotide records.


Thank you. Is there an easy way to do this with Python as well?


Yes, take a look at Bio.Entrez. Personally, though, I find it easier to use the subprocess module in Python and run the Entrez Direct commands from within the script, but that may not always be feasible.
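For reference, here is a rough sketch of how the Bio.Entrez route might look. It mirrors the pipeline above: elink from protein to nuccore, then efetch with rettype "acc" to turn the linked UIDs into accessions. The function names and the email argument are made up for illustration, and it assumes elink accepts an accession.version such as NP_056922.1 directly as the id. Treat it as a starting point, not a tested recipe.

```python
def extract_link_ids(linksets):
    """Collect the target UIDs from a parsed elink result.

    `linksets` is the list returned by Entrez.read() on an elink handle;
    each item may carry one or more "LinkSetDb" entries with "Link" lists.
    """
    ids = []
    for linkset in linksets:
        for linksetdb in linkset.get("LinkSetDb", []):
            for link in linksetdb.get("Link", []):
                ids.append(str(link["Id"]))
    return ids


def protein_to_nucleotide(protein_acc, email):
    """Return nucleotide accessions linked to one protein accession.

    Hypothetical helper: requires Biopython and network access to NCBI.
    """
    # Imported here so extract_link_ids() stays usable without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # Step 1: follow protein -> nuccore links (assumes the accession is
    # accepted as an id, as it is in the EDirect pipeline).
    handle = Entrez.elink(dbfrom="protein", db="nuccore", id=protein_acc)
    linksets = Entrez.read(handle)
    handle.close()

    uids = extract_link_ids(linksets)
    if not uids:
        return []

    # Step 2: convert the linked UIDs to accession.version strings.
    handle = Entrez.efetch(db="nuccore", id=",".join(uids),
                           rettype="acc", retmode="text")
    text = handle.read()
    handle.close()
    if isinstance(text, bytes):  # some Biopython versions return bytes
        text = text.decode()
    return text.split()
```

You would call it as protein_to_nucleotide("NP_056922.1", email="you@example.org") and loop over your list of proteins. A protein can link to more than one nucleotide record, so the function returns a list. For long lists, keep NCBI's request rate limits in mind.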
