I have a FASTA file consisting of 5 concatenated E. coli genomes. I simulated paired-end reads from it with wgsim, aiming for 25x coverage, which gave a little over 690K reads in total. I then ran abyss-pe (ABySS 2.0.2) on this dataset as follows:
abyss-pe name=test k=50 in='seq.read1.fq seq.read2.fq'
My run fails after about 7 minutes with the following error message:
abyss-scaffold: scaffold.cc:504: bool isOverlap(const DistanceEst&): Assertion 'd.distance < 0' failed.
I have run ABySS on smaller read datasets, up to 625K reads, and it handled those without any problems. I have also tried changing the k-mer size to 75, but I get exactly the same error. Has anyone seen this before, or does anyone have an idea how to work around this issue?
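For reference, the relationship between target coverage and the number of read pairs passed to wgsim can be sketched as below. The total genome size and read length are not stated in this post, so the values used here (23,000,000 bp and 100 bp) and the input file name genomes.fa are placeholders only; -N, -1 and -2 are the wgsim options for the number of read pairs and the two read lengths.

```shell
# Read pairs needed for a target coverage:
#   pairs = coverage * genome_size / (2 * read_length), rounded up.
read_pairs_for_coverage() {
    local genome_size=$1 coverage=$2 read_len=$3
    echo $(( (genome_size * coverage + 2 * read_len - 1) / (2 * read_len) ))
}

# Placeholder inputs (assumptions, not taken from the post).
genome_size=23000000
coverage=25
read_len=100

pairs=$(read_pairs_for_coverage "$genome_size" "$coverage" "$read_len")

# Print the wgsim command rather than running it; genomes.fa is a hypothetical name.
echo "wgsim -N $pairs -1 $read_len -2 $read_len genomes.fa seq.read1.fq seq.read2.fq"
```

Plugging in the real genome size and read length shows whether the simulated read count actually corresponds to the intended coverage.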
Is there any specific reason for using a ~3-year-old version of the tool? Could you try the same assembly with the latest version?
I upgraded to ABySS 2.2.3 and reran the scenario. The original error went away, but the run still fails, now with the following messages:
error: no edge 7515- -> 40850+
warning: the head of 40850+ does not match the tail of the previous contig
Although the issues with running in the default mode persisted, I was able to successfully generate unitigs by using
abyss-pe name=test k=50 in='seq.read1.fq seq.read2.fq' unitigs
Since this was the original goal for which I was using ABySS, I am closing this question.
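A quick way to confirm that the unitigs-only run produced output is sketched below. It assumes that abyss-pe with name=test writes the unitigs to test-unitigs.fa, following its usual name-unitigs.fa convention; the helper function count_fasta_records is hypothetical and simply counts FASTA header lines.

```shell
# Count FASTA records by counting header lines that start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Assumed output name for a run started with name=test.
unitigs="test-unitigs.fa"

if [ -s "$unitigs" ]; then
    echo "$unitigs: $(count_fasta_records "$unitigs") unitigs"
else
    echo "$unitigs is missing or empty" >&2
fi
```

For fuller assembly statistics such as N50, the abyss-fac tool that ships with ABySS can be run on the same file.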
Hello nsapoval!
We believe that this post does not fit the main topic of this site.
Resolved the issue as far as I was concerned. The original issue still remains open.
For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.
If you disagree, please tell us why in a reply below; we'll be happy to talk about it.
Cheers!
Please do not close a post after it has received a response. If a given answer resolves the question, please accept it so that others can see how the issue was resolved.