Unzipped output files: fasta files in python
3.5 years ago
Debut ▴ 20

I have a Python script that downloads the sequences of a species whose name the user types at the input prompt. The sequences are saved directly into a folder named after the species, but in compressed format (_genomic.fna.gz or _protein.faa.gz). I would like to add a function that unzips the output files directly, because there are a lot of them (for Klebsiella, for example, almost 14,000 sequences). If someone could help me add this, please. Here is my code:

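import os
import posixpath
import urllib.parse

import wget  # third-party package (pip install wget)

# NOTE: `data` is assumed to be a pandas DataFrame loaded earlier;
# the loading code is not shown in this post.
species = input("Bacteria species ? : ")
TypeSeq = input("fna ? or faa ? : ")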

if data["#Organism/Name"].str.contains(species, case = False).any():

    print(data.loc[data["#Organism/Name"].str.contains(species, case = False)]['Status'].value_counts())  
    FTP_list = data.loc[data["#Organism/Name"].str.contains(species, case = False)]["FTP Path"].values

if  TypeSeq == "faa" :

    if not os.path.exists(species):
        os.makedirs(species)

    for url in FTP_list:
        try:
            parts = urllib.parse.urlparse(url)
            suffix = "_protein.faa.gz"
            # The last path component is the assembly name used as file prefix.
            prefix = posixpath.basename(parts.path)
            print(prefix + suffix)

            path = posixpath.join(parts.path, prefix + suffix)
            ret = parts._replace(path=path)

            sequence = wget.download(urllib.parse.urlunparse(ret), out=species)
        except Exception as err:
            # Report the failure instead of silently printing an empty line.
            print(f"Download failed for {url}: {err}")
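
For the unzipping step itself, here is a minimal sketch using only the standard library (gzip, shutil, pathlib). The helper names gunzip_file and gunzip_folder are hypothetical and not part of the script above. It assumes every downloaded file ends in .gz and that the compressed copy can be deleted once it has been extracted.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path, remove_original=True):
    """Decompress one .gz file next to itself and return the new path.

    Hypothetical helper: the name and the remove_original flag are
    illustrative choices, not taken from the original script.
    """
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large files are fine
    if remove_original:
        gz_path.unlink()  # delete the compressed copy
    return out_path


def gunzip_folder(folder):
    """Decompress every *.gz file found directly inside `folder`."""
    return [gunzip_file(p) for p in sorted(Path(folder).glob("*.gz"))]
```

Calling gunzip_folder(species) once after the download loop should leave files such as ..._protein.faa in the species folder. With roughly 14,000 files, decompressing each one right after its download would also work and avoids a second pass over the folder.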
fasta python

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.

Okay, thank you very much.
