Hello everyone,
I have ~10 FASTA files, each containing more than 50 sequences, and I want the 6-frame translation and ORFs for each sequence. I found EMBOSS sixpack for this, but when I went through the manual I learned that it takes only a single sequence as input. Can you please suggest other options (maybe I've missed something)? Is there a way to do this without splitting the files?
Thanks in advance
I think the last line of this answer might be what you want to do. Here's an example to concatenate files:
cat *.fasta > bigfasta.fasta
Also, I'm pretty sure the EMBOSS suite is in a public Galaxy server out there for easy access.
Hi,
Thanks, Michael. I am trying to run your suggestion (the awk command) from Python, but I get a syntax error every time. Can you please help me with this? Below is the code:
import subprocess
cmd = "awk '/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname;next;} {if(x>0) print >> outname;}' {}".format(FASTA_FILE_PATH)
subprocess.call(cmd, shell=True)
Why call awk from Python? Anyway, you have to adapt the quotes, e.g. escape them or use alternative quotes, like qq in Perl. I don't know whether Python has similar functionality.
Of course it has something similar... https://stackoverflow.com/questions/29559905/does-python-have-an-equivalent-of-perls-qq
but simply not as powerful 💪
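For reference, here is a minimal sketch of the quoting fix discussed above. Python's triple-quoted strings fill a role similar to Perl's qq, so the awk program can keep its double quotes unescaped. Two assumptions beyond what was posted: the command is passed as an argument list without shell=True, so no shell quoting is needed at all, and the file path is passed as a separate argument instead of through str.format, because the curly braces in the awk program would otherwise be read as format placeholders. The names AWK_PROGRAM, build_awk_command and split_with_awk are hypothetical.

```python
import subprocess

# The awk program from the post (without the outer shell quotes), kept inside
# a raw triple-quoted string so its double quotes need no escaping.
AWK_PROGRAM = r'''/^>/ {if(x>0) close(outname); x++; outname=sprintf("_%d.faa",x); print > outname;next;} {if(x>0) print >> outname;}'''


def build_awk_command(fasta_path):
    """Return the argument list for awk; no shell is involved, so no shell quoting."""
    return ["awk", AWK_PROGRAM, str(fasta_path)]


def split_with_awk(fasta_path):
    """Run awk on one FASTA file; it writes _1.faa, _2.faa, ... to the current directory."""
    return subprocess.run(build_awk_command(fasta_path), check=True)
```

Calling split_with_awk("input.fasta") would then run the command; it assumes awk is installed and on the PATH.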
Hi,
As I said, I have more than one FASTA file, so I just wrote a Python script to get the path of the input FASTA file. My problem is now solved. Thank you so much for the help!
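For anyone landing here with the same problem, this is a sketch of the same per-record splitting done in plain Python, without awk. The function name split_fasta, the output naming scheme (input file stem plus a counter, ending in .faa) and the output directory are assumptions for illustration, not what the poster actually used.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir="."):
    """Write each record of a multi-FASTA file to its own file; return the new paths."""
    fasta_path = Path(fasta_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    count = 0
    try:
        with fasta_path.open() as handle:
            for line in handle:
                if line.startswith(">"):
                    # A header line starts a new record: close the previous
                    # output file and open the next one.
                    if out is not None:
                        out.close()
                    count += 1
                    target = out_dir / f"{fasta_path.stem}_{count}.faa"
                    out = target.open("w")
                    written.append(target)
                if out is not None:
                    # Lines before the first header are skipped, as in the awk version.
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_fasta on each path from Path(".").glob("*.fasta") would cover all the input files in the current directory.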