Entering edit mode
3.5 years ago
Debut
▴
20
I have a script in python that allows me to download sequences of a species that the user enters the name "input" then his sequences are placed directly in a folder that takes the name of the species. But in zipped format (_genomic.fna.gz
or _protein.faa.gz
). I would like to add a function that allows to zip directly the output files (because it's an important number for example for klebsiella almost 14 000 sequences). Si quelqu'un peut m'aider s'il vous plaƮt( en rajoutant . here is my code :
import os
species = input("Bacteria species ? : ")
TypeSeq= input ("fna ? or faa ? :")
species = input("Bacteria species ? : ")
TypeSeq= input ("fna ? ou faa ?")
if data["#Organism/Name"].str.contains(species, case = False).any():
print(data.loc[data["#Organism/Name"].str.contains(species, case = False)]['Status'].value_counts())
FTP_list = data.loc[data["#Organism/Name"].str.contains(species, case = False)]["FTP Path"].values
if TypeSeq == "faa" :
if not os.path.exists(species):
os.makedirs(species)
for url in FTP_list:
try :
parts = urllib.parse.urlparse(url)
parts.path
posixpath.basename(parts.path)
suffix = "_protein.faa.gz"
prefix = posixpath.basename(parts.path)
print(prefix+suffix)
path = posixpath.join(parts.path, prefix+suffix)
ret = parts._replace(path=path)
sequence = wget.download(urllib.parse.urlunparse(ret), out=species)
except :
print ("")
Please use the formatting bar (especially the
code
option) to present your post better. You can use backticks for inline code (`text` becomestext
), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.Okay, thank you very much.