Question

Need help: How can I run EMBOSS sixpack on a multi-sequence FASTA file?


5.5 years ago

shiv ▴ 10

Hello everyone,

I have ~10 FASTA files, each containing more than 50 sequences, and I want the 6-frame translation and ORFs for every sequence. I found EMBOSS sixpack for this, but according to the manual it takes only a single sequence as input. Can you suggest other options that I may have missed? Is there a way to do this without splitting the files?

Thanks in advance

Assembly gene • 2.0k views

ADD COMMENT • link updated 5.5 years ago by Michael 55k • written 5.5 years ago by shiv ▴ 10

score 1 · Answer 1 · 2019-06-20


5.5 years ago

Michael 55k

This could be a workflow combining your favorite solution of:

How To Split A Multiple Fasta or this aw(k)esome code by Pierre: A: Is there a way to split single .txt file with multiple fasta sequences into indi
Bash Loop For Job Submission and here: A: Bash Loop For Job Submission (the sixpack call needs to be fully parameterized; otherwise sixpack asks for user input, and it doesn't seem to read from STDIN or write to STDOUT)

This should work without having to install any additional software on Linux and macOS:

awk '/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.fa",x); print > outname;next;} {if(x>0) print >> outname;}' *.fasta

for f in _*.fa
do
    sixpack -sequence "$f" -outfile "$f.sixpack.out" -outseq "$f.sixpack.fa"
done
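For anyone who would rather do the splitting step in Python than in awk, here is a minimal sketch of the same idea. The function name `split_fasta` and its parameters are made up for illustration; it only mirrors the `_1.fa`, `_2.fa`, ... naming of the awk one-liner and does no validation of the FASTA content.

```python
from pathlib import Path


def split_fasta(fasta_paths, out_dir=".", prefix="_", suffix=".fa"):
    """Write each FASTA record to its own numbered file and return the paths.

    Hypothetical helper: the numbering runs across all input files, like the
    counter x in the awk command.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    count = 0
    handle = None
    try:
        for fasta_path in fasta_paths:
            with open(fasta_path) as source:
                for line in source:
                    if line.startswith(">"):
                        # A header line starts a new record: switch output file.
                        if handle is not None:
                            handle.close()
                        count += 1
                        out_path = out_dir / f"{prefix}{count}{suffix}"
                        handle = open(out_path, "w")
                        written.append(out_path)
                    if handle is not None:
                        # Lines before the first header are skipped.
                        handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return written
```

Calling `split_fasta(sorted(Path(".").glob("*.fasta")))` would then produce the per-sequence files that the sixpack loop expects.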

If you want to have the output in a single file, use cat to combine them.
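If the rest of the pipeline is in Python anyway, the combining step could look roughly like this. It is only a sketch: `combine_files` is a hypothetical helper, and the pattern in the usage note assumes the output names from the loop above.

```python
import shutil
from pathlib import Path


def combine_files(directory, pattern, combined_path):
    """Concatenate all files matching pattern into combined_path (like cat)."""
    combined_path = Path(combined_path)
    # Collect the inputs before opening the output, and skip the output file
    # itself in case it matches the pattern.
    parts = sorted(
        p for p in Path(directory).glob(pattern)
        if p.resolve() != combined_path.resolve()
    )
    with open(combined_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts
```

For example, `combine_files(".", "_*.fa.sixpack.out", "all.sixpack.out")` would gather the per-sequence reports. Note that the files are joined in lexical order, so `_10` comes before `_2`.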

ADD COMMENT • link 5.5 years ago by Michael 55k


I think the last line of this answer might be what you want to do. Here's an example to concatenate files:

cat *.fasta > bigfasta.fasta

Also, I'm pretty sure the EMBOSS suite is in a public Galaxy server out there for easy access.

ADD REPLY • link 5.5 years ago by colindaven 7.0k


Hi,

Thanks, Michael. I am trying to run your suggestion (the awk command) from Python, but I get a syntax error every time. Can you please help me with this? Below is the code:

import subprocess

cmd = "awk '/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname;next;} {if(x>0) print >> outname;}' {}".format(FASTA_FILE_PATH)

subprocess.call(cmd, shell=True)

FASTA_FILE_PATH : full path of input fasta file

ADD REPLY • link 5.4 years ago by shiv ▴ 10


Why call awk from Python? Anyway, you have to adapt the quotes: the double quotes inside the awk program end your Python string early, so either escape them or use alternative quoting, like qq in Perl. I don't know if Python has similar functionality.

Of course it has something similar... https://stackoverflow.com/questions/29559905/does-python-have-an-equivalent-of-perls-qq

but simply not as powerful 💪
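One way the quoting could be handled, as a sketch rather than a tested pipeline: keep the awk program in a raw triple-quoted string, so its double quotes and braces need no escaping and `str.format` is never applied to it, and pass the arguments as a list so no shell quoting is involved. The names `AWK_PROGRAM`, `build_awk_command` and `split_with_awk` are made up for this example.

```python
import subprocess

# Raw triple-quoted string: the inner double quotes and curly braces are
# taken literally, and the string is never run through str.format.
AWK_PROGRAM = r'''/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname; next} {if(x>0) print >> outname}'''


def build_awk_command(fasta_file_path):
    """Return the argument list for awk; the path is a separate argument."""
    return ["awk", AWK_PROGRAM, fasta_file_path]


def split_with_awk(fasta_file_path):
    """Run awk without a shell; raises CalledProcessError if awk fails."""
    subprocess.run(build_awk_command(fasta_file_path), check=True)
```

Because the path is passed as its own list element, file names with spaces work without extra quoting, and `shell=True` is not needed.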

ADD REPLY • link 5.4 years ago by Michael 55k


Hi,

As I said, I have more than one FASTA file, so I tried to write a Python script that gets the path of each input FASTA file. My problem is solved now. Thank you so much for the help!

ADD REPLY • link 5.4 years ago by shiv ▴ 10