Hi,
I have encountered a problem while running MUSCLE through the msa package in R (via Rscript). I get this:
*** ERROR *** MSA::SetIdCount: cannot increase count
Fatal error, exception caught.
Error in msaFun(inputSeqs = inputSeqs, cluster = cluster, gapOpening = gapOpening, :
MUSCLE finished by an unknown reason
I couldn't find out what it means. Is it that I have too many sequences? If anyone has an idea of how to solve this, that would be great.
Thanks!
What is your input data like? You haven't told us how many sequences you have, so we can't tell you if it's too many.
Have you tried running it on a subset of your data?
It would also be worth trying to run the data through MUSCLE directly, without the R wrapper.
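To make the subset suggestion concrete, here is a minimal sketch in R. The helper name try_subsets and its arguments are made up for illustration; align_fun stands for whatever alignment call you are using. It only tells you at which subset sizes the call errors out, not why.

```r
# Hypothetical helper: run an alignment function on random subsets of
# increasing size and record which sizes succeed.
# `seqs` can be anything that supports length() and `[`, for example a
# DNAStringSet or a plain character vector.
try_subsets <- function(seqs, align_fun, sizes, seed = 1) {
  set.seed(seed)
  # Ignore requested sizes larger than the data set.
  sizes <- sizes[sizes <= length(seqs)]
  ok <- vapply(sizes, function(n) {
    idx <- sample(length(seqs), n)
    # An R-level error from the aligner is recorded as FALSE.
    tryCatch({
      align_fun(seqs[idx])
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
  data.frame(size = sizes, ok = ok)
}
```

With the msa package loaded you could call it as try_subsets(dna, function(x) msa::msa(x, method = "Muscle"), sizes = c(100, 500, 1000)). The sizes are arbitrary. If the failure depends on earlier calls in the same R session rather than on the data, the result may differ from a fresh session.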
Thanks for your answer! I am running it on 10,000-20,000 sequences of around 1,500 bp, 96 times in a loop. When I ran a case that had failed in the loop on its own, there was no issue. I think the cause is what is described in the "Known issues" section of the msa package vignette (here: https://bioconductor.org/packages/release/bioc/vignettes/msa/inst/doc/msa.pdf ), namely that there can be memory leaks with MUSCLE. Anyway, I used ClustalOmega instead and it worked very well.
Thanks again!
Are your sequence headers unique? Are you using a profile? Could you provide the command that you're using?
This is the part of the code that gives you the error:
If you search for MSA::SetIdCount in the link below, you can find the code where this function is used, which might give you some idea of the reason behind the error: https://git.wur.nl/haars001/reas/-/tree/master/muscle3.6_src
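Coming back to the question about unique headers, a quick way to check this in R is sketched below. The function name duplicated_headers and the example headers are made up. Whether duplicate headers would actually trigger this particular error is not something I can confirm; this only rules the possibility in or out.

```r
# Return the header names that occur more than once
# (an empty result means all headers are unique).
duplicated_headers <- function(headers) {
  unique(headers[duplicated(headers)])
}

# Example with made-up headers; for a DNAStringSet called `dna`
# you would pass names(dna) instead.
headers <- c("seq1", "seq2", "seq1", "seq3")
dups <- duplicated_headers(headers)
print(dups)
```

Here dups contains "seq1", the only header that appears twice.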
Hi, thanks for your reply!
Yes, I had found this, but was not sure what it meant (what is m_uIdCount?). Yes, my headers are unique (I wasn't sure whether duplicates would be a problem, so thanks for this), and I don't know if I am using a profile. As I said, I have found a workaround, which is using ClustalOmega instead of MUSCLE. I agree it does not solve the underlying issue, but at least it helps.
Thanks again!
Hi, did you manage to resolve this? I am having the exact same error. I created an alignment with 2067 sequences using "dna = msa(dna, method = "Muscle", order="aligned")", and the input was a DNAStringSet. I then ran a second alignment with 174 sequences, which also ran okay. But when I moved on to the third alignment, with 2644 sequences, I received the error "*** ERROR *** MSA::SetIdCount: cannot increase count". So smaller alignments seem to run okay, and I only get the error when I move to bigger datasets.
I changed the method and used "dna = msa(dna, method = "ClustalOmega", order="aligned")" instead, and it worked okay. I think there is a bug in the "Muscle" method.
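For anyone who wants to script this workaround, here is a rough sketch that tries Muscle first and falls back to ClustalOmega if the call throws an R error. The wrapper name align_with_fallback and the align_fun argument are made up for illustration; by default it calls msa::msa with the same arguments used above. This sidesteps the Muscle problem rather than fixing it, and if the failed Muscle call leaves the R session in a bad state, the fallback may be affected too.

```r
# Hypothetical wrapper: try each alignment method in turn and return the
# first result that does not raise an error.
# `align_fun` defaults to msa::msa (requires the msa package when called).
align_with_fallback <- function(seqs,
                                methods = c("Muscle", "ClustalOmega"),
                                align_fun = msa::msa) {
  for (m in methods) {
    result <- tryCatch(
      align_fun(seqs, method = m, order = "aligned"),
      error = function(e) {
        message(m, " failed: ", conditionMessage(e))
        NULL
      }
    )
    # Stop at the first method that produced an alignment.
    if (!is.null(result)) {
      return(list(method = m, alignment = result))
    }
  }
  stop("all alignment methods failed")
}
```

Usage would be res <- align_with_fallback(dna), after which res$method tells you which method produced res$alignment.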
Hi (quite late), sorry I had completely forgotten this. I did the same, there seems to be some issue with the muscle method (I haven't checked if it is still the case, though). Thanks for sharing your solution here!
I have this problem with 7 sequences.
All sequences have a width of 361 nt.
The sequences are also not identical:
As previously noted, it works with method="ClustalOmega" (using Gonnet??)