Hi everyone,
I am trying to download the "vertebrate" GI list from NCBI Entrez Protein, but it seems to be impossible (other formats such as Summary or FASTA download normally). Here is the link:
https://www.ncbi.nlm.nih.gov/protein/?term=%22vertebrates%22%5Bporgn%3A__txid7742%5D
Does anyone have the same problem, and do you know how to fix it? Thank you very much in advance.
You are trying to download 12,080,735 GIs. I think the process will time out. You are better off running a local script on a gi_taxid mapping file from here: ftp://ftp.ncbi.nih.gov/pub/taxonomy
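A minimal sketch of such a local script, in case it helps. It assumes two things that are not spelled out above: that the taxonomy dump's nodes file has "|"-separated fields with the taxid first and the parent taxid second, and that the gi_taxid mapping file has two whitespace-separated columns (GI, then taxid). The function names are made up for illustration. Because most proteins are assigned to species-level taxids rather than to 7742 itself, the script first collects every taxid below the root and then filters the mapping.

```python
from collections import defaultdict, deque


def descendant_taxids(nodes_lines, root):
    """Return a set holding root and every taxid below it in the tree."""
    children = defaultdict(list)
    for line in nodes_lines:
        if not line.strip():
            continue
        fields = line.split("|")
        child = int(fields[0].strip())
        parent = int(fields[1].strip())
        if child != parent:  # the top node lists itself as its own parent
            children[parent].append(child)

    wanted = {root}
    queue = deque([root])
    while queue:  # breadth-first walk down from the root
        for kid in children.get(queue.popleft(), ()):
            if kid not in wanted:
                wanted.add(kid)
                queue.append(kid)
    return wanted


def filter_gis(gi_taxid_lines, wanted):
    """Yield the GI (as text) of every row whose taxid is in wanted."""
    for line in gi_taxid_lines:
        parts = line.split()
        if len(parts) >= 2 and int(parts[1]) in wanted:
            yield parts[0]
```

For a real run you would pass open file handles instead of lists, for example `descendant_taxids(open("nodes.dmp"), 7742)` and then `filter_gis(open("gi_taxid_prot.dmp"), wanted)`; those file names are assumptions, so check what the FTP directory actually provides. Both functions stream line by line, so only the taxid set has to fit in memory.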
Thanks for your suggestion!
I seem to have the same problem: I cannot download the accession numbers of a large taxonomic group (in my case Insecta). Could you give more details on how to accomplish this using the listed files? Are all accession numbers available there, and how do I retrieve them for my taxonomic group of choice? Thanks a lot in advance!
See this: A: How to retrieve any and all NCBI/GenBank accession numbers from a Taxonomy ID?
Thanks for the link, but downloading the accession list file from NCBI does not seem to work for me (is the file too big?). Is there a way around this? It seems the solution is here: ftp://ftp.ncbi.nih.gov/pub/taxonomy. However, given my limited bioinformatics knowledge, the README files are not very clear to me...
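In case a concrete starting point is useful, here is a rough sketch of filtering an accession-to-taxid mapping file locally. It assumes a tab-separated layout with a header row and the columns accession, accession.version, taxid, gi; please verify that against the README before relying on it. The function name is hypothetical, and `wanted` must already contain every taxid under your group (for Insecta, all of its descendant taxids, not only the top-level one).

```python
def accessions_for_taxids(lines, wanted):
    """Yield accession.version for each row whose taxid is in wanted."""
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Skip the header row and anything malformed: the taxid column
        # (third column, by assumption) must be a plain integer.
        if len(cols) < 3 or not cols[2].isdigit():
            continue
        if int(cols[2]) in wanted:
            yield cols[1]
```

If the mapping file is gzip-compressed, it can be streamed without unpacking by passing `gzip.open(path, "rt")` as `lines`, which keeps memory use small even for very large files.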
On my connection, downloading the accession list took a very long time (the speed is limited to 2-3 kb/s). It took me nearly a day to download the 12 million accession IDs for vertebrates. You could ask a friend or someone on Biostar with a better connection to download the Insecta accession list and send it to you.