Question

SPAdes run with different results

0

5.0 years ago

lizabe ▴ 10

Hi! I am running SPAdes, and the number of contigs differs between runs even though the final k-mer size is the same. For example, if I run:

spades.py \
-1 $READS1 \
-2 $READS2 \
-s $READS3 \
-o $RESULTS \
-k 125 \
--careful \
--threads $SLURM_CPUS_PER_TASK

I get 229 contigs.

If I run:

spades.py \
-1 $READS1 \
-2 $READS2 \
-s $READS3 \
-o $RESULTS \
-k 111,115,117,121,123,125 \
--careful \
--threads $SLURM_CPUS_PER_TASK

I get 244 contigs in the final contigs.fasta file. When I count the contigs in the final_contigs.fasta file of each generated folder, I find that the K125 folder has 244 contigs, so I deduce that contigs.fasta was obtained with the 125 k-mer.
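The per-folder count described above can be reproduced with a short shell sketch. The function names `count_contigs` and `report_contig_counts` are hypothetical helpers, not part of SPAdes, and the sketch assumes the output layout mentioned in the post (one `K*` folder per k-mer size, each holding a final_contigs.fasta, plus a top-level contigs.fasta).

```shell
# Hypothetical helper: print the number of FASTA records (header lines) in a file.
count_contigs() {
    grep -c '^>' "$1"
}

# Hypothetical helper: print "<file><TAB><count>" for every per-k assembly
# and for the final contigs.fasta inside a SPAdes output directory.
report_contig_counts() {
    outdir=$1
    for f in "$outdir"/K*/final_contigs.fasta "$outdir"/contigs.fasta; do
        # Skip unmatched globs and missing files.
        if [ -e "$f" ]; then
            printf '%s\t%s\n' "$f" "$(count_contigs "$f")"
        fi
    done
}
```

It would be called as `report_contig_counts "$RESULTS"`. Writing each run to its own output directory keeps the two sets of counts from overwriting each other.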

Does anyone know why this is happening? Thank you!

spades • 1.3k views

ADD COMMENT • link 5.0 years ago by lizabe ▴ 10

0

In that case, wouldn't -k 111,115,117,121,123,125 be expected to give the best result, that is, fewer contigs than -k 125 alone?

ADD REPLY • link 5.0 years ago by lizabe ▴ 10

1

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.

ADD REPLY • link 5.0 years ago by GenoMax 152k

1

A smaller number of contigs doesn't necessarily mean a better assembly. As a hypothetical example, wrongly joining two contigs (a misassembly) may decrease the number of contigs and increase N50 but, because the join is incorrect, it decreases assembly quality.

There are many different metrics for evaluating an assembly. To get a good picture of assembly quality, it is recommended to use several of them.
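As a rough illustration of looking at more than the contig count, the sketch below reports contig count, total length, and N50 for one FASTA file. `assembly_stats` is a hypothetical helper name, and N50 is taken here as the length of the contig at which the running total of lengths (sorted longest first) reaches half of the assembly size. These numbers say nothing about correctness, so a misassembly would still go unnoticed. Dedicated tools such as QUAST report many more metrics.

```shell
# Hypothetical helper: print contig count, total length and N50 for a FASTA file.
assembly_stats() {
    # Stage 1: emit one length per contig.
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }   # header: flush previous contig
        { gsub(/[ \t\r]/, ""); len += length($0) }       # sequence line: add its length
        END { if (len > 0) print len }                   # flush the last contig
    ' "$1" |
    # Stage 2: sort lengths longest first, then walk them to find N50.
    sort -rn |
    awk '
        { lens[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                if (run >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d total_bp=%d N50=%d\n", NR, total, n50
        }
    '
}
```

Running `assembly_stats contigs.fasta` on the output of each SPAdes run gives a slightly fuller comparison than the contig count alone.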

ADD REPLY • link 5.0 years ago by h.mon 35k

score 2 · Answer 1 · 2020-07-30

2

5.0 years ago

h.mon 35k

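SPAdes doesn't run several k-mer sizes and then select the best assembly. Rather, it incorporates information from all k-mer sizes when building the assembly graph. This means that -k 111,115,117,121,123,125 is not the same as -k 125, and differences are to be expected. It is also why the K125 folder of the multi-k run matches the final contigs.fasta: it is the last iteration, built on the results of the smaller k values, not an independent k=125 assembly.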

ADD COMMENT • link 5.0 years ago by h.mon 35k