3.5 years ago
Debut
I'm in a bind. Is there a way to duplicate the last folder in a path and append "_genomic.fna.gz" to it? For example, how do I change this: "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3" to this: "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3/GCA_001316945.3_ASM131694v3_genomic.fna.gz"? Thanks
What have you tried? Please explain your problem to us in as much detail as you can.
I have loaded this file into a pandas dataframe: "https://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt". I would like to download files in ".fna.gz" or ".faa.gz" format using the FTP path given in this table. For example, to download the .fna.gz file I need to go from this: "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3" to this: "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/GCA_001316945.3_ASM131694v3/GCA_001316945.3_ASM131694v3_genomic.fna.gz". The download works with the second link. So there are two solutions. The first is to append "/" to the ftp_path, then the duplicated last folder name, then "_genomic.fna.gz". The second is to append "/" to the ftp_path, then the assembly accession (from the table), then "_", then the asm_name, then "_genomic.fna.gz". This has to be done for each row, i.e. each row uses its own ftp_path, asm_name and assembly accession.
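The two approaches described above can be sketched as plain string functions. This is only a minimal sketch: the function names `genomic_url` and `genomic_url_from_columns` are made up for illustration, and it assumes the ftp_path values look like the example in the question.

```python
import posixpath


def genomic_url(ftp_path, suffix="_genomic.fna.gz"):
    """Approach 1: repeat the last folder name and append the file suffix."""
    base = ftp_path.rstrip("/")          # tolerate a trailing slash
    folder = posixpath.basename(base)    # e.g. GCA_001316945.3_ASM131694v3
    return f"{base}/{folder}{suffix}"


def genomic_url_from_columns(ftp_path, accession, asm_name,
                             suffix="_genomic.fna.gz"):
    """Approach 2: rebuild the file name from the accession and asm_name."""
    base = ftp_path.rstrip("/")
    return f"{base}/{accession}_{asm_name}{suffix}"


# Example path taken from the question.
example = ("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/316/945/"
           "GCA_001316945.3_ASM131694v3")

url1 = genomic_url(example)
url2 = genomic_url_from_columns(example, "GCA_001316945.3", "ASM131694v3")
print(url1)
print(url1 == url2)  # both approaches agree for this row
```

With the table loaded in a dataframe `df`, the first function could be applied per row with something like `df["ftp_path"].map(genomic_url)`. The first approach is probably the safer of the two, since it only relies on the folder name already present in ftp_path, whereas the second assumes that the asm_name column matches the folder name exactly, which may not hold for every row (for instance if a name contains spaces).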
I don't know if this is possible; I couldn't find any lines of code that would let me do it. Please help.
You've figured out two viable approaches to solving your problem. Try one or both of them on a few entries, and if they work, go with it. I don't see where you need any help with this - you've got it!
While I am not addressing your question there are already tools that allow you to selectively download data from NCBI genomes (sounds like that is where you are headed). Save time and use them: How to download all Pseudomonas aeruginosa Genomes from NCBI Genomes database?