I am running a de novo whole-genome assembly using about 300 GB of Illumina NextSeq paired-end data for a eukaryotic genome of approximately 1.45 Gb. My server has 512 GB of RAM, a 7.5 TB hard disk, and 64 processors. My problem is that the assembly has been stuck at the 'generating adjacency' step for the past 30-35 days. I started the assembly on 7 May, and it has now been running for about 48 days. I am unsure whether to abort the assembly or let it continue. Kindly guide me. I am not using the MPI option since we do not have a cluster. I would also like to know whether there is any procedure or code to speed up the assembly.
Hi @way2manmohan,
Wow... more than 30 days! Something is wrong for sure. (I have assembled human genomes with the MPI version, and the first stage typically takes about 24 hours.)
@cyril-cros may be right about running out of RAM (below). It is good advice to watch the process in `top` or `htop`. By "stuck", do you mean that you are no longer seeing any new messages in the log output? It would be helpful to post your full log output to a GitHub gist and link to it here. Also, please enable the verbose option, if possible (add `v=-v` to the abyss-pe command line).

It is possible to use the MPI version even on a single machine, and it will speed things up greatly. When using the non-MPI version, you are only using 1 of 64 available cores. To use the MPI version you just need to have a recent OpenMPI library installed on your system (and also the OpenMPI development headers). If you have a system administrator, they should be able to do that for you very easily.
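
As a rough sketch of the two suggestions above (MPI on a single machine plus verbose logging), the snippet below builds an abyss-pe command line and prints it for review. `np` and `v=-v` are abyss-pe parameters; the values of `k`, `name`, and the read file names are placeholders I made up for illustration, not values from this thread, so substitute your own.

```shell
# Sketch only: k, name, and the read file names are hypothetical placeholders.
np=64                                  # number of MPI processes (one per core here)
k=64                                   # k-mer size; choose a value suited to your data
name=myassembly                        # prefix for the output files
reads='reads_1.fq.gz reads_2.fq.gz'    # paired-end read files

# np=... selects the MPI version of the assembler; v=-v enables verbose logging.
cmd="abyss-pe np=$np v=-v k=$k name=$name in='$reads'"

# Print the command so it can be checked before launching it.
echo "$cmd"
```

Once the printed command looks right, paste it into the shell to start the run. Keep in mind that all `np` processes share the same 512 GB of RAM on a single machine, so it is still worth watching memory use in `top` or `htop` while it runs.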