7.5 years ago
Paul
How to download all the complete genomes for mycobacteria from NCBI?
I tried downloading the complete genomes from the NCBI site
ftp://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/
But I couldn't find the exact FASTA files for the respective mycobacteria, and https://www.ncbi.nlm.nih.gov/genome/?term=mycobacteria gave me 421 hits.
Thanks, this worked.
I tried your method but I get an empty urls.txt file. Has the format changed?
It hasn't changed. I just tried the above and see 2,481 Mycobacter genomes with the status "Complete Genome".
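For anyone who ends up with an empty urls.txt, here is a minimal Python sketch of the same kind of filter, so you can check what actually matches. It assumes the table is tab-separated with a plain header row containing the columns organism_name, assembly_level and ftp_path (these names are my assumption about the assembly summary table, so check them against your file), and the function name is just for illustration.

```python
import csv
import io


def complete_genome_paths(table_text, genus="Mycobacter"):
    """Return the ftp_path of every row marked "Complete Genome" for a genus.

    Assumes a tab-separated table whose first line is a header with the
    columns organism_name, assembly_level and ftp_path (assumed names).
    """
    reader = csv.DictReader(
        io.StringIO(table_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    paths = []
    for row in reader:
        # Exact match on the status, substring match on the organism name.
        if row["assembly_level"] == "Complete Genome" and genus in row["organism_name"]:
            paths.append(row["ftp_path"])
    return paths
```

If this returns an empty list, the usual suspects are a header line that still starts with "#" or a status string that differs in case or spacing.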
Okay, thank you for your answer.
Please, is it possible to put all the output sequences in one file (a single file containing all the FASTA records)?
Thank you very much for your answer, but I have 10668 output files. Is there a command to add, for example, after "IFS=$'\n'; for NEXT in $(cat urls.txt); do wget "$NEXT"; done"? I tried "IFS=$'\n'; for NEXT in $(cat urls.txt); do wget "$NEXT"; done >doc.txt" but it didn't work.
The output files all end in ".gz", right?
So
zcat *.gz > all.fna
Use zcat instead of cat because they're gzip archives.
Hi, I'm trying to do this with Python. I've already loaded my table with pandas and I have the FTP path, but I need to go from:
"ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3"
to this:
"ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3/GCA_001316945.3_ASM131694v3_genomic.fna.gz"
Thanks
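One way to do that in Python, going only by the two URLs in the post: the file name looks like the last component of the FTP path with "_genomic.fna.gz" appended. A minimal sketch under that assumption (the function name is made up for illustration):

```python
def genomic_fasta_url(ftp_path):
    """Build the *_genomic.fna.gz URL from an assembly's FTP directory path.

    Assumes the file is named after the last path component, as in the
    example URLs in the post above.
    """
    base = ftp_path.rstrip("/")          # tolerate a trailing slash
    name = base.rsplit("/", 1)[-1]       # e.g. GCA_001316945.3_ASM131694v3
    return f"{base}/{name}_genomic.fna.gz"
```

With a pandas table you could probably apply it to the whole column at once with Series.map, for example df["ftp_path"].map(genomic_fasta_url), where df and the column name "ftp_path" are assumptions about how your table is loaded.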