Increasing kmer limit in SPAdes

2

8.7 years ago

rrcutler ▴ 120

Hello all. I have been assembling a bacterial genome with a number of different assemblers. Kmer optimizers based on the Velvet assembler put the optimum kmer value at around 190-200. When I run the assembly in SPAdes, I get the best assembly at its maximum kmer value of 127. Given the evidence from the Velvet-based optimizers, I want to try a SPAdes assembly with kmer = 199.

Is there a way to increase the kmer size limit in SPAdes? I know there is a way in Velvet.

Thanks

Assembly Bacterial Spades Velvet • 5.6k views

ADD COMMENT • link 8.7 years ago by rrcutler ▴ 120

2

I doubt that you would get any better assemblies in practice even if you could extend the kmer size beyond 127. What kind of reads do you have?

ADD REPLY • link 8.7 years ago by piet ★ 1.9k

1

Illumina 1.9, sequenced on a MiSeq. Read length = 250 bp

ADD REPLY • link 8.7 years ago by rrcutler ▴ 120

0

Do you have 50x+ coverage in your data as suggested by SPAdes folks here?

ADD REPLY • link 8.7 years ago by GenoMax 150k

0

Yes. When I map the raw reads back to the contigs from an assembly and analyze the result with Qualimap, coverage = 465 and standard deviation = 263.

ADD REPLY • link 8.7 years ago by rrcutler ▴ 120

1

genomax2 is referring to kmer coverage. What you are reporting is read coverage. Please look at the SPAdes output and find the "average coverage" that SPAdes reports for each kmer size. The kmer coverage for the last (largest) kmer size also appears in the header lines of the FASTA file containing the contigs. Take the coverage of the largest contig, which most likely represents part of the chromosome you want to sequence.

ADD REPLY • link 8.7 years ago by piet ★ 1.9k
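The header lookup described above can be sketched in a few lines of Python. This is a minimal sketch, assuming contig headers of the form `NODE_<index>_length_<bp>_cov_<coverage>`, possibly with extra trailing fields; the example headers and the function names are made up for illustration and are not taken from the poster's data.

```python
import re

# Assumed SPAdes-style header: NODE_<index>_length_<bp>_cov_<kmer coverage>,
# optionally followed by further fields (for example _ID_<n>).
HEADER_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_header(line):
    """Return (length, kmer_coverage) for a header line, or None if it does not match."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    return int(match.group(2)), float(match.group(3))


def largest_contig_coverage(fasta_lines):
    """Return (length, kmer_coverage) of the longest contig found in the headers."""
    best = None
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        parsed = parse_header(line)
        if parsed is not None and (best is None or parsed[0] > best[0]):
            best = parsed
    return best


# Hypothetical headers, for illustration only.
example = [
    ">NODE_1_length_412345_cov_166.2",
    "ACGT",
    ">NODE_2_length_1200_cov_801.7_ID_3",
    "ACGT",
]
print(largest_contig_coverage(example))
```

To run it on a real assembly, pass an open file handle for the contigs FASTA instead of the `example` list.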

0

Thanks for the clarification. The kmer coverage is 166.

ADD REPLY • link 8.7 years ago by rrcutler ▴ 120
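To get a feel for what a larger k would do to that number, here is a rough sketch using the relation given in the Velvet manual, Ck = C * (L - k + 1) / L, where C is read coverage, L is read length and k is the kmer size. It assumes every read is exactly 250 bp and ignores trimming and sequencing errors, so treat the result as a ballpark figure only. The function names are illustrative.

```python
def kmer_coverage(read_coverage, read_length, k):
    """Velvet-manual relation between read coverage and kmer coverage."""
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return read_coverage * (read_length - k + 1) / read_length


def rescale_kmer_coverage(cov_at_k1, read_length, k1, k2):
    """Project a kmer coverage observed at k1 onto a different kmer size k2."""
    if max(k1, k2) > read_length:
        raise ValueError("k must not exceed the read length")
    return cov_at_k1 * (read_length - k2 + 1) / (read_length - k1 + 1)


# Values from the thread: kmer coverage 166 at k = 127, 250 bp reads.
projected = rescale_kmer_coverage(166, 250, 127, 199)
print(round(projected, 1))
```

Under these assumptions, moving from k = 127 to k = 199 would cut the kmer coverage from 166 to roughly 70.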

1

Can you clarify the "best" part? As in, the number of contigs is still decreasing?

ADD REPLY • link 8.7 years ago by GenoMax 150k

0

Yes, and also the N50 value. Do you recommend any other metrics to determine "best"?

ADD REPLY • link 8.7 years ago by rrcutler ▴ 120
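For reference, the two metrics mentioned here are easy to compute from a list of contig lengths. The sketch below uses the usual definition of N50 (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length); the lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths, for illustration only.
contigs = [100, 80, 50, 30, 20]
print(len(contigs), n50(contigs))
```

Comparing both values across kmer sizes on the same read set is what "best" refers to in this exchange; neither number says anything about misassemblies.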

0

Can you tell me the read length?

ADD REPLY • link 8.7 years ago by Bioinformatics_NewComer ▴ 330

1

Read length = 250 bp

ADD REPLY • link 8.7 years ago by rrcutler ▴ 120

0

Did you ever figure out how to bump the k-mer size up beyond the 127 limit? Your post reminded me of "choice of k-mer size for metagenomic assembly".

I've got 2x150 bp paired-end data, and I want to play with higher k-mer values to assess any improvement in assembly contiguity and completeness. So sharing your experience would help me. Thanks!

ADD REPLY • link 7.7 years ago by Anand Rao ▴ 640
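For trying a range of k-mer sizes within the existing limit, a small helper can build the comma-separated list that the `-k` option of `spades.py` expects. This is a sketch under two assumptions: k values must be odd and shorter than the reads, and the ceiling is the 127 discussed in this thread (pass a different `max_k` if your build differs). The helper name and its default start and step values are arbitrary choices for illustration.

```python
def candidate_kmers(read_length, start=21, step=22, max_k=127):
    """Return ascending odd k-mer sizes below the read length and up to max_k."""
    if start % 2 == 0 or step % 2 != 0:
        raise ValueError("start must be odd and step even so every k stays odd")
    upper = min(max_k, read_length - 1)
    ks = list(range(start, upper + 1, step))
    # Always include the largest admissible odd k as the final value.
    top = upper if upper % 2 == 1 else upper - 1
    if ks and ks[-1] != top:
        ks.append(top)
    return ks


# 2x150 bp reads, as in the post above.
ks = candidate_kmers(150)
print(",".join(str(k) for k in ks))
```

The printed string can be passed as the value of `-k`.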
