You can do this with Python, though I don't see a particularly compelling reason to do so over just running MAFFT natively at the command line.
    import sys
    import tempfile

    from Bio.Align.Applications import MafftCommandline

    files = sys.argv[1:]
    lines = []
    # Read every input file into memory
    for file in files:
        with open(file, 'r') as ifh:
            lines.append(ifh.read())

    # Text mode ('w+') so that str can be written under Python 3
    with tempfile.NamedTemporaryFile(mode='w+') as temp:
        temp.write('\n'.join(lines))
        temp.flush()  # make sure MAFFT sees the full contents on disk
        mafft_cline = MafftCommandline(input=temp.name)
        stdout, stderr = mafft_cline()

    print(stdout)
Run this as:
python mafft_alignment.py *.fas
on the commandline.
This writes the concatenated sequences from all of your input files to a temporary file for MAFFT, so you don't have to worry about intermediate filehandles etc. It reads every input file into memory first though, so this might be an issue if your files are very big (depends how much memory you have).
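If memory does become a problem, one option is to copy the files into the temporary file in chunks rather than reading them whole. This is only a rough sketch: `concatenate_streaming` is a name I made up for illustration, and it assumes plain, uncompressed FASTA input.

```python
import shutil


def concatenate_streaming(paths, out_handle):
    """Copy each input file into out_handle chunk by chunk.

    out_handle must be opened in binary mode. Hypothetical helper,
    not part of Biopython or MAFFT.
    """
    for path in paths:
        with open(path, "rb") as src:
            # copyfileobj copies in fixed-size chunks, so a whole file
            # is never held in memory at once
            shutil.copyfileobj(src, out_handle)
        # Same effect as '\n'.join above: a newline between files, so a
        # file without a trailing newline can't run into the next header
        out_handle.write(b"\n")
    # Flush so that an external program sees the full contents on disk
    out_handle.flush()
```

You would call it with the default binary-mode `tempfile.NamedTemporaryFile()` as `out_handle`, then hand `temp.name` to MAFFT as before.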
python mafft_alignment.py A.fasta B.fasta
gives me:
>RandomSequence_bhKRyVJoNyY4GralWOtVXRs9NWgLuDzS
gctacggta-gttagtgacccaggg------ccgagggcttccccgaactaaacacaatt
atcataatttggtccactcccgtgttc
>RandomSequence_dVyOdIlHB1I29BLYaVvjVIInwXbxldXU
---agggcatcttagtgtaccgcgacactacctaaagggtcgcttattttttgcccggtt
gtgaacagtaggcgcattgttgg----
For the input data:
$ cat A.fasta
>RandomSequence_bhKRyVJoNyY4GralWOtVXRs9NWgLuDzS
GCTACGGTAGTTAGTGACCCAGGGCCGAGGGCTTCCCCGAACTAAACACAATTATCATAATTTGGTCCACTCCCGTGTTC
$ cat B.fasta
>RandomSequence_dVyOdIlHB1I29BLYaVvjVIInwXbxldXU
AGGGCATCTTAGTGTACCGCGACACTACCTAAAGGGTCGCTTATTTTTTGCCCGGTTGTGAACAGTAGGCGCATTGTTGG
Personally, I'd concatenate the files with cat first, and then run mafft directly from the binary at the command line though...
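If you still want Python to drive that second step, a thin wrapper around the binary with `subprocess` is probably enough. A minimal sketch, assuming `mafft` is on your PATH and that the sequences are already concatenated into one file; `mafft_command` and `run_mafft` are hypothetical names, and `--auto` just lets MAFFT choose its own strategy.

```python
import subprocess


def mafft_command(fasta_path, mafft_exe="mafft"):
    """Build the argument list for a basic MAFFT run (hypothetical helper)."""
    return [mafft_exe, "--auto", str(fasta_path)]


def run_mafft(fasta_path, mafft_exe="mafft"):
    """Run MAFFT on one FASTA file and return the alignment as text.

    MAFFT writes the alignment to stdout and its progress messages to
    stderr, so only stdout is returned here.
    """
    result = subprocess.run(
        mafft_command(fasta_path, mafft_exe),
        capture_output=True,  # requires Python 3.7+
        text=True,
        check=True,  # raise CalledProcessError if MAFFT exits non-zero
    )
    return result.stdout
```

After something like `cat *.fas > combined.fasta`, calling `run_mafft("combined.fasta")` should give you the same kind of aligned FASTA text as the script above (the file name is just an example).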
Please could you format your code using the 101010 icon, especially for Python code.
You just want to invoke MAFFT on a bunch of sequences via Python?
Why do you want to do this via python specifically?
I want to use Python to align a lot of unaligned sequences, like:
1:-
2-
3-
What is your error?
Please do your import at the beginning of your script:
from Bio.Align.Applications import MafftCommandline