Difficulty with Muscle Sequence Alignment
8.1 years ago
rc16955 ▴ 90

Hi all,

I am trying to perform an alignment of 11 very long (whole chromosome) DNA sequences using Muscle 3.8.31 for Linux, but am encountering a problem.

This is my input:

    muscle-3.8.1 -diags -in S_ratti_Chr2.fa -out S_ratti_Chr2.afa

The code seems to run for a while, but then I get the following:

    S_ratti_Chr2 11 seqs, max length 16759905, avg  length 16759518
    00:48:36  1053 MB(65%)  Iter   1    1.52%  K-mer dist pass 1
    00:48:47  1053 MB(65%)  Iter   1  100.00%  K-mer dist pass 1
    00:48:47  1053 MB(65%)  Iter   1    1.52%  K-mer dist pass 2
    00:48:47  1053 MB(65%)  Iter   1  100.00%  K-mer dist pass 2
    00:49:07  24533 MB(100%)  Iter   1   10.00%  Align node
    /cm/local/apps/torque/4.2.4.1/spool/mom_priv/jobs/4974615.master.cm.cluster.SC: line 15: 95045 Killed                  muscle-3.8.1 -diags -in S_ratti_Chr2.fa -out S_ratti_Chr2_2.afa

No output file is generated. I am very new to bioinformatics and this is the first time I have attempted to use Muscle (or any alignment method); could anyone help me understand why the command is failing? Lay terminology would be greatly appreciated, as I am also not very familiar with coding. I'd also like to hear about any other alignment methods that are generally considered better than Muscle.

Many thanks in advance.

genome • 3.8k views
ADD COMMENT

See the Muscle manual; it may be useful. There is no limit on sequence length:

http://www.drive5.com/muscle/muscle.html

I would suspect some atypical symbol in the sequence (different from ACGT).

That said, it did start running and was only killed during the first iteration...
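To test the atypical-symbol idea, here is a minimal Python sketch that counts every sequence character other than A, C, G, or T in each FASTA record. The function name `count_unexpected_symbols` and the toy records are made up for illustration; lowercase (soft-masked) bases are treated as their uppercase equivalents, which is an assumption about what should count as "typical".

```python
from collections import Counter


def count_unexpected_symbols(fasta_lines, allowed="ACGT"):
    """Return {record name: Counter of characters not in `allowed`}.

    `fasta_lines` can be any iterable of lines, e.g. an open file handle,
    so a large file is read line by line rather than loaded all at once.
    """
    allowed_set = set(allowed.upper())
    counts = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first word after '>' as the record name.
            parts = line[1:].split()
            current = parts[0] if parts else "(unnamed)"
            counts[current] = Counter()
        elif current is not None:
            # Sequence line: tally anything that is not A, C, G or T.
            counts[current].update(
                ch for ch in line.upper() if ch not in allowed_set
            )
    return counts


# Toy input standing in for a real FASTA file (hypothetical records).
example = [">chr2_sampleA", "ACGTNNAC", "acgt-R", ">chr2_sampleB", "ACGT"]
report = count_unexpected_symbols(example)
for name, unexpected in report.items():
    print(name, dict(unexpected))
```

For a real file, something like `with open("S_ratti_Chr2.fa") as handle: report = count_unexpected_symbols(handle)` should work the same way. Runs of N are common in assembled chromosomes, so finding them does not by itself explain the failure.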

Another alignment program to consider is MAFFT; its website is below.

http://mafft.cbrc.jp/alignment/software/

ADD REPLY

Many thanks for your reply. I am not sure that the problem is an atypical character, as the run seems to stop at a different place each time; for example, another attempt produced the report "line 15: 103937 Killed". Nevertheless, would you be able to recommend a way of identifying atypical characters in the FASTA file?

And thank you for linking to Mafft; I will look into it.

ADD REPLY

Are you running it locally on your computer or on a cluster?

ADD REPLY

Looks like this is being run on a cluster.

The first thing to check is how much memory is being consumed. The log shows 24533 MB (100%) immediately before the job was killed, so I suspect that you are running out of memory or hitting a quota.
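One rough way to check memory use is to read it from the Muscle progress output itself. The sketch below is only an illustration: the regular expression is fitted to the progress lines quoted in the question, the function name `peak_memory` is made up, and no interpretation of the percentage in parentheses is assumed.

```python
import re

# Pattern fitted to the progress lines shown in the question, e.g.
# "00:49:07  24533 MB(100%)  Iter   1   10.00%  Align node"
PROGRESS = re.compile(
    r"^\s*(?P<clock>\d{2}:\d{2}:\d{2})\s+(?P<mb>\d+) MB\((?P<pct>\d+)%\)"
    r"\s+Iter\s+(?P<iter>\d+)\s+(?P<done>[\d.]+)%\s+(?P<step>.+?)\s*$"
)


def peak_memory(log_lines):
    """Return (MB, percent, step) for the progress line with the most memory.

    Lines that do not look like progress lines (such as the final
    'Killed' message) are skipped. Returns None if nothing matches.
    """
    peak = None
    for line in log_lines:
        match = PROGRESS.match(line)
        if not match:
            continue
        entry = (int(match["mb"]), int(match["pct"]), match["step"])
        if peak is None or entry[0] > peak[0]:
            peak = entry
    return peak


# Lines copied from the log in the question.
sample_log = [
    "00:48:36  1053 MB(65%)  Iter   1    1.52%  K-mer dist pass 1",
    "00:48:47  1053 MB(65%)  Iter   1  100.00%  K-mer dist pass 2",
    "00:49:07  24533 MB(100%)  Iter   1   10.00%  Align node       ",
    "line 15: 95045 Killed",
]
print(peak_memory(sample_log))
```

On these lines the reported memory jumps from about 1 GB to about 24 GB as soon as the "Align node" step begins, which fits a job being killed for exceeding the memory it was allowed. Comparing that figure with the memory requested for the cluster job would be the next thing to look at.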

ADD REPLY
