Hi,
I have been using clustalo in the terminal to create multiple sequence alignments for the past few weeks, but lately it fails every time I run it. I've tried on two computers, with different files. I've searched for the error messages but haven't found anything. Does anyone know what they mean, or what might be going on?
Thanks in advance!
Here's my command:
clustalo --in=fl_L1PA2.fa --out=L1PA2.aln --force --outfmt=clustal --wrap=175 --threads=6 --verbose
And here's the output. It runs fine through the first few stages, and then fails:
Using 6 threads
Read 914 sequences (type: DNA) from fl_L1PA2.fa
Using 96 seeds (chosen with constant stride from length sorted seqs) for mBed (from a total of 914 sequences)
Calculating pairwise ktuple-distances...
Ktuple-distance calculation progress done. CPU time: 1035.86u 2.24s 00:17:18.09 Elapsed: 00:03:29
mBed created 31 cluster/s (with a minimum of 1 and a soft maximum of 100 sequences each)
Distance calculation within sub-clusters done. CPU time: 387.80u 0.50s 00:06:28.30 Elapsed: 00:01:18
Guide-tree computation (mBed) done.
HHalignWrapper:hhalign_wrapper.c:1419: problem in alignment (profile sizes: 1 + 1) (chr1_142876163-142882314 + chr1_223568877-223574899), forcing Viterbi
hh-error-code=4 (mac-ram=8000)
hhalign:hhalign.cpp:961: Problem Reading/Preparing profiles (len(q)=0/len(t)=0)
HHalignWrapper:hhalign_wrapper.c:1447: problem in alignment, Viterbi did not work
hh-error-code=4 (mac-ram=64000)
hhalign:hhalign.cpp:961: Problem Reading/Preparing profiles (len(q)=0/len(t)=0)
FATAL: could not perform alignment -- bailing out
The log shows 914 sequences, but you didn't tell us how long they are, at least on average. It would appear that some of them are long; with proteins, at least, that is what forces Viterbi.
Do any of your sequences have a length of zero, meaning just the header line?
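A quick way to answer both questions (how long the sequences are, and whether any are empty) is a small script along these lines. This is only a sketch: the function names `read_fasta_lengths` and `summarize` are made up for illustration, and it assumes an ordinary FASTA file where each record starts with a `>` header line.

```python
import sys
from statistics import mean


def read_fasta_lengths(lines):
    """Yield (header, sequence_length) for each record in a FASTA stream."""
    header, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, length  # emit the previous record
            header, length = line[1:], 0
        elif header is not None:
            length += len(line)  # sequence may span several lines
    if header is not None:
        yield header, length  # emit the last record


def summarize(lines):
    """Return count, min/max/mean length, and headers of empty records."""
    records = list(read_fasta_lengths(lines))
    lengths = [n for _, n in records]
    return {
        "count": len(records),
        "min": min(lengths) if lengths else 0,
        "max": max(lengths) if lengths else 0,
        "mean": mean(lengths) if lengths else 0,
        "empty": [h for h, n in records if n == 0],
    }


if __name__ == "__main__":
    # Usage (file name taken from the command in the question):
    #   python fasta_lengths.py fl_L1PA2.fa
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print(summarize(handle))
```

If the `empty` list is not empty, those headers are the records with no sequence under them, and they would be the first thing to look at.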
If these are truly DNA sequences, it might be a good idea to specify --seqtype=DNA.