convert gbk to fasta
4.9 years ago

Hi, guys!

I'm a bioinformatics intern and I'm looking for a script that converts several .gbk files in a directory to .fa files. I have already tried several approaches, but none of them produced any results.

Can anybody help me?

genome • 5.1k views

If you know Python, you can use Biopython's SeqIO to read the .gbk file and write it out in FASTA format.


I wanted it to read several .gbk files in a folder.


A for loop then? Where did you get the data from? Are you sure there aren't fasta files available?
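
A rough sketch of what that for loop could look like in Python, using the parse/write approach suggested above. The function names `fasta_path_for` and `convert_folder` and the example path are made up for illustration, and the sketch assumes Biopython is installed and that the files all sit directly in one folder.

```python
from pathlib import Path


def fasta_path_for(gbk_path):
    """Return the output path: same folder and base name, .fa extension."""
    return Path(gbk_path).with_suffix(".fa")


def convert_folder(folder):
    """Convert every .gbk file directly inside `folder` to FASTA."""
    # Biopython is imported here so the path helper works without it installed.
    from Bio import SeqIO

    for gbk_path in sorted(Path(folder).glob("*.gbk")):
        out_path = fasta_path_for(gbk_path)
        # Format names are "genbank" and "fasta", not the file extensions.
        records = SeqIO.parse(str(gbk_path), "genbank")
        count = SeqIO.write(records, str(out_path), "fasta")
        print(f"{gbk_path.name}: wrote {count} record(s) to {out_path.name}")


# Example call (hypothetical path):
# convert_folder("/path/to/dir")
```

Each output file lands next to its input, so the folder needs to be writable.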

4.9 years ago
Joe 21k

You don't need to mess about with parse and write, as there is a convenience function for this, SeqIO.convert (unless you specifically want to control the metadata that gets written).

To run over an entire folder:

for file in /path/to/dir/*.gbk ; do
    python -c "from Bio import SeqIO; SeqIO.convert($file, genbank, ${file%.*}.fasta, fasta);"
done

Thanks Joe! This was super useful.

I added some quotes to make it work for me:

for file in *.gbk ; do
    python -c "from Bio import SeqIO; SeqIO.convert('$file', 'genbank', '${file%.*}.fasta', 'fasta');"
done
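
For the case Joe mentions, where you do want to control the metadata that gets written, parse and write are still the way to go. A minimal sketch, assuming you want a header of the form `prefix|id` with no description; the helper names `relabel` and `convert_with_custom_headers` and that header layout are illustrative choices, not anything from Biopython itself.

```python
def relabel(record, prefix):
    """Give a record a custom FASTA header: '<prefix>|<original id>'."""
    record.id = f"{prefix}|{record.id}"
    record.description = ""  # drop the free-text description from the header
    return record


def convert_with_custom_headers(gbk_file, fasta_file, prefix):
    """Convert one GenBank file to FASTA, rewriting each record's header."""
    # Biopython is imported here so relabel() can be used without it installed.
    from Bio import SeqIO

    records = (relabel(rec, prefix) for rec in SeqIO.parse(gbk_file, "genbank"))
    # SeqIO.write returns the number of records written.
    return SeqIO.write(records, fasta_file, "fasta")


# Example call (hypothetical file names):
# convert_with_custom_headers("genome.gbk", "genome.fasta", "sampleA")
```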
4.9 years ago
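from Bio import SeqIO
import os

# Walk the directory tree and convert every .gbk file found to FASTA (.fa).
for raiz, subpasta, arquivos in os.walk(
        '/Documentos/parse/GCF_000231365.1/AntiSmash/GCF_000231365.1_ASM23136v1_genomic$'
):
    # os.walk yields a list of file names, so join each one to its folder in turn.
    for arquivo in arquivos:
        origem = os.path.join(raiz, arquivo)
        if origem.endswith(".gbk"):
            destino = origem[:-len(".gbk")] + ".fa"
            # Biopython format names are "genbank" and "fasta", not file extensions.
            with open(origem, "r") as input_handle, open(destino, "w") as output_handle:
                sequences = SeqIO.parse(input_handle, "genbank")
                count = SeqIO.write(sequences, output_handle, "fasta")
            print("%s: %d record(s) written to %s" % (origem, count, destino))

# Disabled fragment: writing one protein entry per feature needs loops over the
# records and their features to define seq_record and seq_feature.
# output_handle.write(
#     ">%s from %s\n%s\n" % (seq_feature.qualifiers['locus_tag'][0],
#                            seq_record.name,
#                            seq_feature.qualifiers['translation'][0]))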
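
The write call with `locus_tag` and `seq_record.name` in the script above looks like part of a per-feature protein export, which is a different task from converting whole records. A minimal sketch of that idea, assuming the CDS features in the GenBank files carry `locus_tag` and `translation` qualifiers; the function names and the `.faa` output are hypothetical.

```python
def format_cds(locus_tag, record_name, translation):
    """Format one protein entry as a FASTA record."""
    return ">%s from %s\n%s\n" % (locus_tag, record_name, translation)


def write_cds_translations(gbk_file, faa_file):
    """Write one protein FASTA entry per CDS that has a translation."""
    # Biopython is imported here so format_cds() can be used without it installed.
    from Bio import SeqIO

    count = 0
    with open(faa_file, "w") as output_handle:
        for seq_record in SeqIO.parse(gbk_file, "genbank"):
            for seq_feature in seq_record.features:
                if seq_feature.type != "CDS":
                    continue
                qualifiers = seq_feature.qualifiers
                # Pseudogenes and some annotations lack these qualifiers.
                if "translation" not in qualifiers or "locus_tag" not in qualifiers:
                    continue
                output_handle.write(format_cds(
                    qualifiers["locus_tag"][0],
                    seq_record.name,
                    qualifiers["translation"][0],
                ))
                count += 1
    return count


# Example call (hypothetical file names):
# write_cds_translations("genome.gbk", "genome.faa")
```

Features without either qualifier are skipped rather than raising a KeyError, which seems the safer default for mixed annotations.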