Hi, I have a list of NCBI GCA/GCF identifiers for many (hundreds of) vertebrate whole genomes that I would like to download. I initially tried Entrez.efetch, but I have realized that GCA numbers cannot be used with it, and it may not be an efficient method for so many large genomes anyway. I'm now looking at using ftplib or the NCBI datasets tool, but I'm new to bioinformatics and am having trouble working out the best approach with either. Has anyone done this, and if so, do you have example code you are willing to share? I'm hoping to do this with Python or the command line. Any help would be greatly appreciated, thanks!
There are how-to guides for the NCBI datasets command-line tool: https://www.ncbi.nlm.nih.gov/datasets/docs/v2/how-tos/ with one specifically for large genomes: https://www.ncbi.nlm.nih.gov/datasets/docs/v2/how-tos/genomes/large-download/
Please upvote the original post:
Getting a curl: (22) The requested URL returned error: 500 ERROR
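In case it helps, here is a rough Python sketch of the dehydrated-download workflow that the large-genomes guide describes: download a small "dehydrated" zip for all accessions, unzip it, then rehydrate to fetch the actual sequence files. It assumes the `datasets` CLI is installed and on your PATH. The helper names (`read_accessions`, `build_commands`, `download_genomes`) and the file names are made up for illustration, and the accession pattern is my own assumption about what a GCA/GCF identifier looks like, so please check the flags against `datasets download genome accession --help` for your installed version.

```python
import re
import subprocess
import zipfile
from pathlib import Path

# Assumed shape of an assembly accession: GCA_/GCF_, nine digits,
# optional ".version" suffix (e.g. GCF_000001405.40).
ACCESSION_RE = re.compile(r"^GC[AF]_\d{9}(\.\d+)?$")


def read_accessions(lines):
    """Return valid accessions in input order, skipping blanks, comments and duplicates."""
    accessions = []
    for line in lines:
        acc = line.strip()
        if not acc or acc.startswith("#"):
            continue
        if not ACCESSION_RE.match(acc):
            raise ValueError(f"Does not look like a GCA/GCF accession: {acc!r}")
        if acc not in accessions:
            accessions.append(acc)
    return accessions


def build_commands(accession_file, zip_path, out_dir):
    """Build the two datasets CLI calls: dehydrated download, then rehydrate."""
    download = [
        "datasets", "download", "genome", "accession",
        "--inputfile", str(accession_file),  # one accession per line
        "--dehydrated",                      # fetch metadata and file links only
        "--include", "genome",               # genomic FASTA only
        "--filename", str(zip_path),
    ]
    rehydrate = ["datasets", "rehydrate", "--directory", str(out_dir)]
    return download, rehydrate


def download_genomes(accessions, work_dir):
    """Run the full workflow; requires the datasets CLI and network access."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    accession_file = work / "accessions.txt"
    accession_file.write_text("\n".join(accessions) + "\n")
    zip_path = work / "genomes_dehydrated.zip"
    out_dir = work / "genomes"

    download, rehydrate = build_commands(accession_file, zip_path, out_dir)
    subprocess.run(download, check=True)   # raises CalledProcessError on failure
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    subprocess.run(rehydrate, check=True)  # downloads the sequence files
    return out_dir


# Example use (not run here, since it downloads large files):
#   with open("my_accessions.txt") as handle:
#       accs = read_accessions(handle)
#   download_genomes(accs, "vertebrate_genomes")
```

If a run is interrupted, re-running only the rehydrate step on the same directory should be enough, since the dehydrated zip has already been unpacked at that point.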