8.3 years ago
rrcutler
Hello all. I have been assembling a bacterial genome with several different assemblers. Using kmer optimizers based on the Velvet assembler, I found the optimum kmer value to be around 190-200. When I run the assembly in SPAdes, I get the best assembly at its maximum kmer value of 127. Given the evidence from the Velvet-based optimizers, I would like to try a SPAdes assembly with kmer = 199.
Is there a way to increase the kmer size limit in SPAdes? I know there is a way in Velvet.
Thanks
I doubt you would get any better assemblies in practice even if you could extend the kmer size beyond 127. What kind of reads do you have?
Illumina 1.9, sequenced on a MiSeq. Read length = 250 bp
Do you have 50x+ coverage in your data as suggested by SPAdes folks here?
Yes. When I map the raw reads to the contigs from an assembly and analyze the result with Qualimap, I get coverage = 465 (std dev = 263).
genomax2 is referring to kmer coverage. What you are reporting is read coverage. Please look at the SPAdes output and find the "average coverage" that SPAdes reports for each kmer size. The kmer coverage for the last (largest) kmer size also appears in the header lines of the FASTA file containing the contigs. Use the coverage of the largest contig, which most likely represents part of the chromosome you want to sequence.
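As a rough illustration of reading the kmer coverage from the contig headers, here is a minimal Python sketch. It assumes the usual SPAdes header layout `>NODE_<id>_length_<bp>_cov_<coverage>`; the function names and the example file path are hypothetical, and other SPAdes versions or modes may add extra fields to the header.

```python
import re

# Assumed SPAdes contig header layout: >NODE_1_length_245133_cov_166.21
HEADER_RE = re.compile(r"^>?NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header):
    """Return (length, kmer_coverage) from one header line, or None if it does not match."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    return int(match.group(2)), float(match.group(3))


def largest_contig_coverage(fasta_lines):
    """Scan FASTA lines and return (length, kmer_coverage) of the longest contig."""
    best = None
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        parsed = parse_spades_header(line)
        if parsed is not None and (best is None or parsed[0] > best[0]):
            best = parsed
    return best


# Hypothetical usage; replace the path with your own SPAdes output file:
# with open("spades_out/contigs.fasta") as handle:
#     print(largest_contig_coverage(handle))
```

The coverage in the header refers to the largest kmer size used in the run, so it is only comparable between runs that ended on the same kmer size.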
Thanks for the clarification. Looking at kmer coverage, I get a value of 166.
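For a rough sense of how read coverage and kmer coverage relate, the Velvet manual gives the idealized formula C_k = C * (L - k + 1) / L, where C is read (nucleotide) coverage, L is read length and k is the kmer size. The sketch below only evaluates that formula; the function name is made up for illustration, and the value an assembler actually reports can differ from this estimate, for example because of read trimming and sequencing errors.

```python
def expected_kmer_coverage(read_coverage, read_length, k):
    """Idealized kmer coverage: C_k = C * (L - k + 1) / L (Velvet manual formula)."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return read_coverage * (read_length - k + 1) / read_length


# Larger k leaves fewer kmers per read, so the expected kmer coverage drops.
for k in (55, 99, 127):
    print(k, expected_kmer_coverage(465, 250, k))
```

The loop uses the read coverage (465) and read length (250 bp) quoted in this thread purely as input values; it is not a prediction of what SPAdes will report.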
Can you clarify what you mean by "best"? As in, the number of contigs is still decreasing?
Yes, and the N50 value is still improving as well. Do you recommend any other metrics to determine "best"?
Can you tell us the read length?
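Since contig count and N50 are the two metrics being compared here, this is a minimal sketch of how both can be computed from a list of contig lengths. It uses the common definition of N50 (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length); the function name and the example lengths are illustrative only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 for an empty collection)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


# Made-up contig lengths, for illustration only.
contig_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("contigs:", len(contig_lengths), "N50:", n50(contig_lengths))
```

Both numbers reward contiguity only, so they say nothing about misassemblies; comparing runs on these metrics alone can favor an assembly that joins contigs incorrectly.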
Read length = 250 bp
Did you ever figure out how to raise the k-mer size above the maximum of 127? Your post reminded me of the choice of k-mer size for metagenomic assembly.
I've got 2x150 bp paired-end data and want to try higher k-mer values to see whether assembly contiguity and completeness improve, so hearing about your experience would help me. Thanks!