Difficulty retrieving NCBI taxonomic IDs from multiple accession IDs
Dear All,
I am trying to extract taxids from a file containing NCBI accession IDs; I have about 20,000 accession IDs.
I tried following this answer:
1.) A: NCBI Accession Number to Taxonomy ID
This did not work; I got the following error message: ERROR in fetch input: Search Backend failed: read request has timed out. peer: 130.14.18.27:7011
2.) I have heard of the file: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_gb.accession2taxid.gz . I tried to perform some grep functions but to no avail, and I could not find a script that helps parse thousands of accession IDs.
I would really appreciate it if anyone could help.
Thanks!
ncbi
written 5.7 years ago by Ming • updated 5.7 years ago by GenoMax
2.) I have heard of the file: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_gb.accession2taxid.gz .
I tried to perform some grep functions but to no avail, and I could
not find a script that helps parse thousands of accession IDs.
If you downloaded that file, you can easily find the taxIDs with zgrep. Assume your accessions are in a file called acc.
$ more acc
X68822
Z18640
Z18643
Z18642
Z18647
Z18649
Z18665
Z18651
X02323
Z18653
X59440
Z18654
X56823
X56218
X68287
$ for i in `cat ./acc`; do zgrep -m1 -w "$i" nucl_gb.accession2taxid.gz; done
X68822 X68822.1 9731 1118
Z18640 Z18640.1 9731 1121
Z18643 Z18643.1 27615 1128
Z18642 Z18642.1 27615 1129
Z18647 Z18647.1 27610 1130
Z18649 Z18649.1 27611 1135
Z18665 Z18665.1 27613 1137
Z18651 Z18651.1 27616 1151
X02323 X02323.1 9887 1160
Z18653 Z18653.1 9773 1161
X59440 X59440.1 9668 1163
Z18654 Z18654.1 27617 1164
X56823 X56823.1 9886 1165
X56218 X56218.1 452646 1168
X68287 X68287.1 9940 1194
The third column contains the taxID. I will leave it to you to work out how to get only the accession and taxID (hint: use cut or awk).
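For a list of about 20,000 accessions, the loop above decompresses and scans the mapping file once per accession, which may be slow. A possible alternative, sketched below, reads the accession list into memory and makes a single pass over the mapping file with awk, printing only the accession and taxID columns. The function name acc2taxid and the output file name are hypothetical. The sketch assumes the mapping file is tab-separated with the columns accession, accession.version, taxid, gi (as in the output above), and that the accession list has one unversioned accession per line and is not empty.

```shell
#!/bin/sh
# Sketch: look up every accession in one pass over the mapping file.
# acc2taxid is a hypothetical helper name, not an NCBI tool.
#   $1: text file with one accession per line (no version suffix)
#   $2: gzip-compressed accession2taxid file
#       (assumed tab-separated: accession, accession.version, taxid, gi)
acc2taxid() {
    gzip -dc "$2" | awk -F '\t' '
        # First input (the accession list): remember each wanted accession.
        NR == FNR { want[$1]; next }
        # Second input (stdin, the mapping file): print accession and taxID.
        $1 in want { print $1 "\t" $3 }
    ' "$1" -
}

# Example call (output file name is an assumption):
#   acc2taxid acc nucl_gb.accession2taxid.gz > acc_taxid.tsv
```

The output follows the order of the mapping file rather than the order of acc, and accessions that are missing from the mapping file are simply absent from the result, so it is worth comparing line counts afterwards.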
See my comment here: C: NCBI Accession Number to Taxonomy ID
This appears to be a local issue with your firewall restrictions.