This is cross-posted with Increasing kmer limit in SPAdes, but because I may never receive a reply from that post, which is > 1 year old, I am re-posting afresh.
1. Is it possible to increase the k-mer size beyond 127 in SPAdes?
2. Is it NECESSARY to test k-mer values > 127 with SPAdes?
[Read trimming, prior to SPAdes assembly, yielded a trimmed read-length distribution of 41-150 bp.]
I've got 2x150 bp paired-end data for fungal spores (haploid genome), and I want to try higher k-mer values to see whether assembly contiguity and completeness improve. Sharing your experience would help me. Thank you!
With 150bp reads, it is very unlikely that you will benefit from K>127, which is convenient because you can't do that with SPAdes. You can extend reads by merging them or with Tadpole, though, which will often improve the assembly. If you download BBMap, there is a suggested procedure for this process in bbmap/pipelines/assemblyPipeline.sh.
Incidentally, Tadpole supports arbitrarily long values of K, but SPAdes will still generally give a substantially more contiguous assembly.
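To illustrate why K>127 is unlikely to help with 150bp reads, here is a minimal sketch of the usual k-mer arithmetic: a read of length L contributes L - k + 1 k-mers, so k-mer coverage is roughly base coverage times (L - k + 1) / L. The function names and the 100x base coverage are assumptions made up for the example, not anything from SPAdes or BBTools.

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of k-mers one read contributes; reads shorter than k contribute none."""
    return max(read_len - k + 1, 0)


def kmer_coverage(base_coverage: float, read_len: int, k: int) -> float:
    """Approximate k-mer coverage: base coverage scaled by (L - k + 1) / L."""
    return base_coverage * kmers_per_read(read_len, k) / read_len


# Hypothetical 100x base coverage with untrimmed 150 bp reads.
for k in (55, 77, 99, 127):
    print(k, kmers_per_read(150, k), round(kmer_coverage(100.0, 150, k), 1))
```

At k=127 a full-length 150bp read yields only 24 k-mers, and any read trimmed below 127bp (the trimmed reads here go down to 41bp) yields none at that k. Longer merged or extended reads are what make large k values usable.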
OK, good to know k>127 is not possible with SPAdes.
This is possibly a premature question, I've gotta first read your suggested procedure for this process in bbmap/pipelines/assemblyPipeline.sh. :)
But is your recommendation to extend reads using Tadpole, and THEN using SPAdes on these results?
I'll take a look at BBTools 37.5, after the admins of my univ. HPCC install it. I am assuming this merge you talk of does not use bbmerge.sh, or else you might have suggested that instead....
No, the assembler options are just there to give the usage syntax because I tend to forget it; you only need to use one of them. The choice of assembler kind of depends on the dataset.
As for BBMerge, yes, I am talking about that when I say "merge" :)
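As a rough sketch of the "merge first, then assemble" order discussed above, the snippet below only builds the two command lines and prints them; it does not run anything. It is not the content of assemblyPipeline.sh. It assumes the bbmerge.sh parameters in1/in2/out/outu1/outu2 and the spades.py options -1/-2/--merged/-k/-o as I understand them from the tools' help text, so check them against your installed versions. All file names are hypothetical.

```python
import shlex

MAX_SPADES_K = 127  # upper k-mer limit in a stock SPAdes build, per the thread


def bbmerge_cmd(r1, r2, merged, unmerged1, unmerged2):
    """Build a BBMerge command that merges overlapping pairs and keeps the rest."""
    return ["bbmerge.sh", f"in1={r1}", f"in2={r2}", f"out={merged}",
            f"outu1={unmerged1}", f"outu2={unmerged2}"]


def spades_cmd(merged, unmerged1, unmerged2, outdir, kmers=(21, 33, 55, 77, 99, 127)):
    """Build a SPAdes command using merged reads plus the still-unmerged pairs."""
    for k in kmers:
        # SPAdes expects odd k values no larger than 127.
        if k % 2 == 0 or k > MAX_SPADES_K:
            raise ValueError(f"k={k} must be odd and <= {MAX_SPADES_K}")
    return ["spades.py", "-1", unmerged1, "-2", unmerged2, "--merged", merged,
            "-k", ",".join(str(k) for k in kmers), "-o", outdir]


# Hypothetical file names, printed for inspection rather than executed.
print(shlex.join(bbmerge_cmd("r1.fq", "r2.fq", "merged.fq", "u1.fq", "u2.fq")))
print(shlex.join(spades_cmd("merged.fq", "u1.fq", "u2.fq", "spades_out")))
```

Printing the commands instead of running them makes it easy to review the parameters before submitting the job on a cluster.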
(3) After compiling, edit the "options_storage.py" file in /SPAdes-3.xx.x/share/spades/spades_pipeline as below.
(maybe around line 40 in the section #other constants)