Hi all,
I am trying to perform an alignment of 11 very long (whole chromosome) DNA sequences using Muscle 3.8.31 for Linux, but am encountering a problem.
This is my input:
muscle-3.8.1 -diags -in S_ratti_Chr2.fa -out S_ratti_Chr2.afa
The code seems to run for a while, but then I get the following:
S_ratti_Chr2 11 seqs, max length 16759905, avg length 16759518
00:48:36 1053 MB(65%) Iter 1 1.52% K-mer dist pass 1
00:48:47 1053 MB(65%) Iter 1 100.00% K-mer dist pass 1
00:48:47 1053 MB(65%) Iter 1 1.52% K-mer dist pass 2
00:48:47 1053 MB(65%) Iter 1 100.00% K-mer dist pass 2
00:49:07 24533 MB(100%) Iter 1 10.00% Align node
/cm/local/apps/torque/4.2.4.1/spool/mom_priv/jobs/4974615.master.cm.cluster.SC: line 15: 95045 Killed muscle-3.8.1 -diags -in S_ratti_Chr2.fa -out S_ratti_Chr2_2.afa
No output file is generated. I am very new to bioinformatics and this is the first time I have attempted to use Muscle (or any aligning method); could anyone help me understand why the command is failing? Lay terminology would be greatly appreciated as I am also not very familiar with coding. I'd also like to hear of any other aligning methods if there are any generally considered better than Muscle.
Many thanks in advance.
See the Muscle manual, it may be useful. There is no limit on sequence length:
http://www.drive5.com/muscle/muscle.html
I would suspect some atypical symbol in the sequence (anything other than ACGT).
That said, it did start running and was only killed during the first iteration...
Another, more recent alignment program is Mafft. Its website is below.
http://mafft.cbrc.jp/alignment/software/
Many thanks for your reply; I am not sure that the problem is one of an atypical character as it seems to stop at different places each time I run it, for example another attempt produced the report "line 15: 103937 Killed". Nevertheless, would you be able to recommend a way of identifying atypical characters in the fasta?
And thank you for linking to Mafft; I will look into it.
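On the question of finding atypical characters: one way is to tally every character in each sequence and report anything that is not A, C, G or T. The following is only a minimal sketch, assuming a plain-text FASTA file where header lines start with ">"; the function name count_unexpected is made up for this example and is not part of Muscle or any library. Lowercase letters are folded to uppercase before counting, so soft-masked sequence is not reported as atypical.

```python
from collections import Counter

def count_unexpected(lines, allowed="ACGT"):
    """Return {sequence header: {character: count}} for characters not in `allowed`.

    `lines` can be an open file or any iterable of strings.
    """
    allowed_set = set(allowed)
    totals = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new sequence record.
            current = line[1:]
            totals[current] = Counter()
        elif current is not None:
            # Count every character on the line (case-insensitive).
            totals[current].update(line.upper())
    # Keep only the characters outside the allowed alphabet.
    return {
        name: {ch: n for ch, n in counts.items() if ch not in allowed_set}
        for name, counts in totals.items()
    }

# Example use with the file name from the question:
# with open("S_ratti_Chr2.fa") as handle:
#     for name, odd in count_unexpected(handle).items():
#         print(name, odd)
```

An empty result for a sequence means it contains only A, C, G and T. If "N" shows up, that is usually just the placeholder for unknown bases in an assembly, so it is worth looking at which characters appear before treating them as errors.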
Are you running it locally on your computer or on a cluster?
Looks like this is being run on a cluster.
The first thing to check is how much memory is being consumed. I suspect that you may be running out of it or hitting a quota.
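The progress lines that Muscle prints already carry a memory figure, so a quick first check is to pull out the largest one. This is a rough sketch, assuming the job output has been saved to a text file and that the lines look like the ones pasted in the question; peak_memory is a hypothetical helper written for this example, not something Muscle provides.

```python
import re

# Matches progress lines in the format shown in the question, e.g.
# "00:49:07 24533 MB(100%) Iter 1 10.00% Align node"
PROGRESS = re.compile(
    r"^(?P<elapsed>\d+:\d+:\d+)\s+(?P<mb>\d+)\s*MB\((?P<pct>\d+)%\)\s+"
    r"Iter\s+\d+\s+[\d.]+%\s+(?P<step>.+?)\s*$"
)

def peak_memory(lines):
    """Return (megabytes, step, elapsed) for the highest memory reading, or None."""
    best = None
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match is None:
            continue  # not a progress line (e.g. the summary or the "Killed" line)
        reading = (int(match.group("mb")), match.group("step"), match.group("elapsed"))
        if best is None or reading[0] > best[0]:
            best = reading
    return best

# Example use, assuming the job output was saved as "muscle_job.log" (made-up name):
# with open("muscle_job.log") as handle:
#     print(peak_memory(handle))
```

Run on the log in the question, this picks out the 24533 MB reading at the "Align node" step, up from 1053 MB a few seconds earlier and immediately before the job was killed. That jump is what you would expect to see if the job ran out of memory, but the scheduler's own records for the job are the place to confirm it.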