How to convert protein IDs to nucleotide accession IDs with E-utilities?
4.3 years ago
sankadinesh ▴ 20

I have an Excel sheet with protein IDs and want to convert them to nucleotide IDs. How can I do this using E-utilities or any other method? Please reply. Thanks. Regards, Dinesh

sequencing gene • 1.1k views

Dear sir, thank you. Can you please suggest how to convert a list of protein IDs (10,000 IDs) rather than just one? Thanks once again. Regards, Dinesh S L


Please provide a few example IDs.


Dear Sir, I figured it out. This command works on the list of IDs shown below it:

 epost -db protein -input /Users/apple/Desktop/test.rtf -format acc | elink -target nuccore | efetch -format acc

AAA21838
AAA22008
AAA22014
AAA26329
AAA26332
AAA87251
AAA93020
AAA96262
AAB36876
AAB41123

The result was

AH000925.2
L34879.1
L04499.1
U53363.1
L47979.1
AH000924.2
L41344.1
U49859.1
J05111.1

But for one ID (AAA21838), there is no nucleotide ID. With thousands of IDs, I won't be able to tell which protein IDs did not yield nucleotide IDs. Thanks. Regards, Dinesh S L


"But for one ID (AAA21838), there is no nucleotide ID."

That is not correct.

$ esearch -db protein -query "AAA21838" | elink -target nuccore | efetch -format acc
L23514.1

Use a variation like this so you will know which IDs did not return a value. The first column contains your source IDs.

for i in `cat acc_file`; do printf '%s\t' "${i}"; esearch -db protein -query "${i}" | elink -target nuccore | efetch -format acc; done

AAA21838    L23514.1
AAA22008    J05111.1
AAA22014    L04499.1
AAA26329    AH000924.2
AAA26332    AH000925.2
AAA87251    L34879.1
AAA93020    U49859.1
AAA96262    L41344.1
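
A slightly more defensive variant of the loop above, offered as a sketch only. It reads the file line by line, always prints exactly one row per input ID, and writes NA when no nucleotide accession comes back, so missing IDs are easy to pick out afterwards. The function names `lookup_nuccore` and `map_ids`, the `NA` placeholder, and the output file name are illustrative assumptions, not part of EDirect; `acc_file` is the one-ID-per-line file used above.

```shell
#!/bin/sh
# Sketch: one output row per protein ID; "NA" marks IDs with no linked record.
# lookup_nuccore and map_ids are hypothetical helper names, not EDirect commands.

# Wrap the same esearch | elink | efetch pipeline used in the loop above.
# </dev/null guards against esearch consuming the ID list on the loop's stdin.
lookup_nuccore() {
    esearch -db protein -query "$1" </dev/null |
        elink -target nuccore |
        efetch -format acc
}

# Read IDs from stdin and print "ID<TAB>accession[,accession...]" per line.
map_ids() {
    while IFS= read -r id; do
        [ -n "$id" ] || continue    # skip blank lines
        # Join multiple accessions with commas; error messages are discarded.
        hits=$(lookup_nuccore "$id" 2>/dev/null | paste -sd, -)
        printf '%s\t%s\n' "$id" "${hits:-NA}"
    done
}

# Usage (needs EDirect installed and network access):
#   map_ids < acc_file > protein_to_nuccore.tsv
#   awk -F'\t' '$2 == "NA"' protein_to_nuccore.tsv   # IDs without a hit
```

Because every ID gets its own row even when nothing is returned, an empty result can no longer cause the next ID to run onto the same line.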

Dear sir, thanks for spending your valuable time. First I tried it like this:

Dinesh$ for i in 'cat /Users/apple/Desktop/testcopy.rtf' ; do printf ${i}"\t" ; esearch -db protein -query ${i} | elink -target nuccore | efetch -format acc;  done

catEntrez Direct does not support positional arguments.
Please remember to quote parameter values containing
whitespace or shell metacharacters.
Db value not found in link input
Db value not found in fetch input

If I run it like this instead:

Dinesh$ for i in cat /Users/apple/Desktop/testcopy.rtf ; do printf ${i}"\t" ; esearch -db protein -query ${i} | elink -target nuccore | efetch -format acc;  done

it returns a long list of seemingly random IDs, followed by this message:

CVRY01000005.1
CVRY01000006.1
NT_086364.3
NT_086333.1
/Users/apple/Desktop/testcopy.rtf   Retrying elink, step 2: Empty result - nothing to do
Retrying elink, step 2: Empty result - nothing to do
Retrying elink, step 2: Empty result - nothing to do
ERROR in link output: Empty result - nothing to do
WebEnv: NCID_1_38538961_130.14.22.76_9001_1596752963_1453260018_0MetA0_S_MegaStore
URL: dbfrom=protein&db=nuccore&query_key=1&WebEnv=NCID_1_38538961_130.14.22.76_9001_1596752963_1453260018_0MetA0_S_MegaStore&cmd=neighbor_history&linkname=protein_nuccore
Result: 
<!DOCTYPE eLinkResult PUBLIC "-//NLM//DTD elink 20101123//EN" "https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20101123/elink.dtd">
<eLinkResult>
<LinkSet>
    <ERROR>Empty result - nothing to do</ERROR>
</LinkSet>
</eLinkResult>


ERROR in fetch input: Empty result - nothing to do

Here is a link to the input file: https://www.dropbox.com/s/nuguyjiupnabd3j/testcopy.rtf?dl=0

Thanks once again and Regards, Dinesh SL


Make sure there is one ID per line. You need to use backticks (`), not plain single quotes ('), around the cat file command. There should be no extraneous formatting characters (I see that you are using an RTF file; use plain text instead). If you create the file on a Windows machine and then move it over to Unix, pass it through the dos2unix program.
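
The cleanup described above can be sketched as a small filter. It strips Windows carriage returns, trims surrounding whitespace, and drops empty lines, so that each remaining line holds a single ID. The function name `clean_ids` and the file name `ids_raw.txt` are hypothetical. The filter does not remove RTF markup, so the list still has to be saved as plain text first.

```shell
#!/bin/sh
# Sketch: normalise an ID list to one clean ID per line.
# clean_ids is a hypothetical helper name; it does NOT strip RTF markup.
clean_ids() {
    # 1. drop DOS carriage returns (similar in effect to dos2unix)
    # 2. trim leading and trailing whitespace on each line
    # 3. remove lines that are now empty
    tr -d '\r' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$'
}

# Usage:
#   clean_ids < ids_raw.txt > acc_file
```

The resulting `acc_file` can then be fed to the loop with backticks around the cat command.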

4.3 years ago
GenoMax 147k

Since you didn't provide any examples, I used a random one.

$ esearch -db protein -query "NP_611925.2" | elink -target nuccore | efetch -format acc
NT_033778.4
NM_138081.4

If you have a number of them, you could use a loop to go through your list with the command above, or use the epost method.
