Hi. We are assembling a genome of Chromobacterium spp. We have two FASTQ files, one forward and one reverse, containing 300 bp Illumina reads. We need to assemble these reads into contigs and then generate scaffolds. Here is the issue:
We used CLC bio to generate contigs, which produced 78 contigs. Next we needed to put these contigs in the right order (the order they have in the original genome). We tried to do this using a reference genome, but failed because CLC does not allow contigs longer than 99,000 bp for this task, and some of ours were longer. So we turned to Mauve, starting from the contigs generated by CLC. Mauve did the job, but left undesirable gaps.
We read some articles and found SPAdes. SPAdes assembled the forward and reverse FASTQ files into contigs and scaffolds, generating 282 contigs and scaffolds (versus 78 from CLC), because SPAdes also produced small contigs, some shorter than 200 bp. So my question is:
How can we generate only contigs longer than 1,000 bp so that we end up with a smaller number of contigs? We did not see a SPAdes parameter for this, and CLC is out of the question because we only had a trial version. What options remain?
Note: the hard drive was formatted, so the contigs no longer exist and we are starting from the FASTQ files again.
Thank you in advance for your time.
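One option that does not depend on any assembler parameter is to filter the assembler's FASTA output by length afterwards. The sketch below is only an illustration of that idea: the function names (`read_fasta`, `filter_contigs`, `write_fasta`) are hypothetical, the 1,000 bp cutoff is the one from the question, and the example sequences are made up. Note that this only discards short contigs; it does not make the remaining contigs any longer or change how the assembly is done.

```python
import io
from typing import Iterable, Iterator, List, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records: Iterable[Record], min_length: int = 1000) -> List[Record]:
    """Keep only records whose sequence is at least min_length bases long."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Made-up example: one 150 bp contig and one 1,200 bp contig, held in memory.
example = io.StringIO(">short_contig\n" + "ACGTA" * 30 + "\n"
                      ">long_contig\n" + "ACGT" * 300 + "\n")
kept = filter_contigs(read_fasta(example), min_length=1000)
out = io.StringIO()
write_fasta(kept, out)
```

With a real assembly, the in-memory `io.StringIO` objects would be replaced by opened files, for example the `contigs.fasta` or `scaffolds.fasta` that SPAdes writes to its output directory; the output file name is up to you.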
Thank you rtliu, I'll try it.
This is better suited as a comment to rtliu's answer than an answer by itself
EDIT: I've moved it there now.