Dear all,
I would like to use Biopython to align some DNA sequences. I got a script from the internet and ran it as follows (~ is short for my home directory):
~$ python
Python 3.5.1 (default, Jul 3 2016, 12:57:35)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Align.Applications import MuscleCommandline
>>> muscle_exe = r"/usr/bin/muscle"
>>> in_file = r"~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta"
>>> out_file = "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"
>>> muscle_cline = MuscleCommandline(muscle_exe, input=in_file, out=out_file)
>>> print(muscle_cline)
/usr/bin/muscle -in "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta" -out "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"
But when launching the application I got:
>>> muscle_cline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python3.5/lib/python3.5/site-packages/Bio/Application/__init__.py", line 516, in __call__
    stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 137 from '/usr/bin/muscle -in "/home/gigiux/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta" -out "/home/gigiux/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"', message 'MUSCLE v3.8.31 by Robert C. Edgar'
>>>
When running muscle directly from the terminal, with no other applications running, I got:
~/$ muscle -in ery_multiseq.fasta -out ery_multiseq_aligned.fa
MUSCLE v3.8.31 by Robert C. Edgar
http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
ery_multiseq 10 seqs, max length 1876490, avg length 1276570
00:00:42 98 MB(-7%) Iter 1 100.00% K-mer dist pass 1
00:00:42 98 MB(-7%) Iter 1 100.00% K-mer dist pass 2
Killed43 608 MB(-41%) Iter 1 11.11% Align node
What could be the issue? I have read online that it might be a memory problem; if so, is the code itself OK, and how could I run large alignments?
Code 137 is most likely a memory problem; the file I am using is 13 MB overall and I have 15 GB of RAM.
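As a sanity check on that reading of code 137, here is a minimal sketch that decodes it, assuming the usual Unix shell convention that an exit status above 128 means "128 plus the number of the signal that ended the process". The helper name describe_exit_code is made up for illustration and is not part of Biopython or MUSCLE.

```python
import signal


def describe_exit_code(code):
    """Return the signal name for a shell-style exit code (128 + N), else None.

    Hypothetical helper for illustration; assumes the 128 + N shell convention.
    """
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # 128 + N where N is not a known signal number on this platform
            return None
    return None


print(describe_exit_code(137))
```

On Linux, 137 - 128 = 9 corresponds to SIGKILL, which fits the "Killed" message in the terminal run above. SIGKILL by itself does not prove a memory problem, but the kernel's out-of-memory handling is one common source of it.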
If I don't have enough memory, how can I extend it?
Is it possible that what I consider a small genomic job could consume so much memory?
Is there a more efficient aligner than MUSCLE? MAFFT, for instance?
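To put the memory question in numbers, here is a rough back-of-the-envelope sketch. It assumes the aligner fills a full dynamic-programming matrix with one cell per pair of positions and one byte per cell. That is an assumption for illustration only, not a measurement of what MUSCLE actually allocates; the function name full_dp_matrix_bytes is hypothetical.

```python
def full_dp_matrix_bytes(length_a, length_b, bytes_per_cell=1):
    """Bytes needed for a full pairwise DP matrix (illustrative assumption)."""
    return length_a * length_b * bytes_per_cell


# Longest sequence reported by MUSCLE in the terminal run above
max_len = 1876490

needed = full_dp_matrix_bytes(max_len, max_len)
available = 15 * 10**9  # the 15 GB of RAM mentioned above

print("bytes needed for one full matrix: %.2e" % needed)
print("times larger than available RAM: %.0f" % (needed / available))
```

Under that assumption, memory grows with the square of the sequence length rather than with the file size, so a 13 MB file of megabase-long sequences could plausibly exceed 15 GB of RAM.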
Many thanks,
Luigi