taxid to genome refseq accession number
6.4 years ago

Dear all, I have a list of taxids like: 10243 10244 10246 10247 10248 10249

I am looking for their corresponding RefSeq genome accession numbers. For example, I searched manually for taxid 10243 and found the RefSeq genome accession number NC_003663.2. Please advise. Thanks.

genome taxid refseq accession number genome • 4.8k views
6.4 years ago
Sej Modha 5.3k

The easiest way is to search the nuccore database and restrict the results to RefSeq records with the refseq[filter] term.

For example,

esearch -db nuccore -query "txid10242[Organism:exp] AND refseq[filter]"|efetch -format acc
NC_037656.1
NC_031033.1
NC_031038.1
NC_003663.2
NC_006998.1
NC_027213.1
NC_008291.1
NC_004105.1
NC_003391.1
NC_003310.1
NC_001611.1

esearch -db nuccore -query "txid10243[Organism:exp] AND refseq[filter]"|efetch -format acc
NC_003663.2
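
For a whole list of taxids, the command above can be wrapped in a loop. This is a minimal sketch, assuming EDirect is installed and the taxids are stored one per line; the function names refseq_query and taxids_to_refseq are hypothetical, not part of EDirect:

```shell
# Build the Entrez query term used above for a single taxid.
refseq_query() {
    printf 'txid%s[Organism:exp] AND refseq[filter]\n' "$1"
}

# Read taxids from stdin (one per line) and print "taxid<TAB>accession"
# for every RefSeq nuccore record found under each taxid.
# Requires EDirect's esearch and efetch on PATH.
taxids_to_refseq() {
    local taxid
    while read -r taxid; do
        [ -n "$taxid" ] || continue   # skip blank lines
        # esearch gets /dev/null as stdin so it cannot consume the
        # remaining taxids that the while loop still has to read.
        esearch -db nuccore -query "$(refseq_query "$taxid")" </dev/null |
            efetch -format acc |
            awk -v t="$taxid" 'NF { print t "\t" $0 }'
    done
}
```

It could be called as: taxids_to_refseq < list > taxid2acc.tsv, where list and taxid2acc.tsv are placeholder file names. Taxids with no RefSeq record simply produce no output lines.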

These utilities are not available on my system, so I get a "command not found" error. Is there a curl/wget link I could use to get the NC_XXX accession for each taxid, or any other workaround?


You'd have to install these utilities on your computer; they can be downloaded from: ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/

You might also find this eutils tutorial helpful.


The solution given there seems to fetch GI numbers, which is not what I need. I need the whole-genome RefSeq accession number for each taxid. Thanks anyway for the help.


Did you miss that part?

Since you want accession numbers add step 4a: Under "Summary" on left side of the page choose "Format" --> "Accession list".


Yes, sure, but that is a manual approach. I am looking for a script or program, as the ID list runs to lakhs (hundreds of thousands) of entries. Once again, thanks for your help.

6.4 years ago

Thank you all for your kind help and direction.

However, I used a different approach to gather the accession numbers, as I could not install esearch and efetch (E-utilities) on my system.

The manual approach was also impossible for such a huge dataset.

My method is a little laborious, but it worked for me, so I am sharing it here in case anyone else needs it:

wget url:

wget "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=10234&lvl=3&lin=f&keep=1&srchmode=1&unlock"

Here I have replaced the taxid with $i, which is read from a file named list:

for i in $(cat list); do wget "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=$i&lvl=3&lin=f&keep=1&srchmode=1&unlock"; done

wget then saves one page per taxid, named after the last part of the URL (wwwtax.cgi?mode=Info&id=...). Running the following for each $i:

grep -E "Scientific name|/genome/\?term=txid$i" "wwwtax.cgi?mode=Info&id=$i&lvl=3&lin=f&keep=1&srchmode=1&unlock" > "Details_$i"

saves the matching lines in Details_$i. The genome link is present only for taxids whose genome is available. For example, taxid 10244 has one:

https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=10244&lvl=3&lin=f&keep=1&srchmode=1&unlock

whereas taxid 10234 does not:

https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=10234&lvl=3&lin=f&keep=1&srchmode=1&unlock

The Details files therefore identify the taxids that have a genome. For those, get the NC_**** accession numbers from the following URL:

https://www.ncbi.nlm.nih.gov/genome/?term=txid10244[Organism:exp]

Hope this might help someone in future too, or someone may improve this to make it more organised.

Thanks once again biostars, especially Sej Modha and genomax for your help and kind guidance.

Thank you
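
For readers who cannot install EDirect, the same nuccore search can also be sent directly to the NCBI E-utilities HTTP endpoints (esearch.fcgi and efetch.fcgi) with curl. This is only a sketch under stated assumptions, not the method used in this post: the function names are hypothetical, retmax=500 is an arbitrary cap on hits per taxid, and the XML is parsed with grep rather than a real XML parser.

```shell
EUTILS="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Build the esearch URL for one taxid (brackets and spaces percent-encoded).
# retmax=500 is an assumed upper bound on the number of hits per taxid.
esearch_url() {
    printf '%s/esearch.fcgi?db=nuccore&retmax=500&term=txid%s%%5BOrganism:exp%%5D+AND+refseq%%5Bfilter%%5D\n' "$EUTILS" "$1"
}

# Pull the numeric UIDs out of esearch XML on stdin, comma-separated.
extract_ids() {
    grep -o '<Id>[0-9]*</Id>' | sed 's/<[^>]*>//g' | paste -sd, -
}

# Print "taxid<TAB>accession" for each RefSeq nuccore record of one taxid.
taxid_to_refseq() {
    local taxid="$1" ids
    ids=$(curl -s "$(esearch_url "$taxid")" | extract_ids)
    [ -n "$ids" ] || return 0   # no RefSeq record for this taxid
    curl -s "$EUTILS/efetch.fcgi?db=nuccore&id=$ids&rettype=acc&retmode=text" |
        awk -v t="$taxid" 'NF { print t "\t" $0 }'
}
```

A loop such as: while read -r t; do taxid_to_refseq "$t"; sleep 1; done < list > taxid2acc.tsv would process the whole list. The sleep keeps the request rate below NCBI's limit of three requests per second for clients without an API key.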


Hello ruchikabhat31,

thank you for responding with a detailed description of your final solution.

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!


Thank you finswimmer, for your help this time. I shall keep that in mind for the next time.
