Question

Spades assembler output

1

9.5 years ago

elbecerrasoto ▴ 30

Hi!

I just want to know what the differences are between the two output files, scaffolds.fasta and contigs.fasta, produced by the assembler spades.py.

The manual doesn't provide any useful information on this at all. Please, I don't want the definition of a contig or a scaffold; what I am trying to understand is the technical part behind the SPAdes algorithm. For example, maybe spades.py assumes that I am using an Illumina 1.9 sequencing machine, but that's not enough information to determine the scaffolds, or is it? How is a scaffold determined in the de Bruijn graph, as opposed to a contig? How does spades.py know the insert size of the specific sequencing protocol used, or does it just assume one?

Help, I am a little lost.

Any information will be useful.

Here is the link to the manual: http://spades.bioinf.spbau.ru/release3.6.2/manual.html#sec3.5

Assembly • 11k views

updated 3.0 years ago by Ram 45k • written 9.5 years ago by elbecerrasoto ▴ 30

1

Scaffolds and contigs are terms used by most assemblers. A scaffold is a construct of multiple contigs bridged by runs of N characters. This means the assembler could not resolve the region between two contigs, even though it knows their orientation relative to each other.
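To make the "contigs bridged by N characters" idea concrete, here is a minimal Python sketch that breaks a scaffold back into its contigs by splitting at runs of N. The function name `split_scaffold` and the toy sequence are made up for illustration; this is not how SPAdes itself works internally, and it says nothing about how a given assembler chooses the length of each N run.

```python
import re

# A gap is any run of one or more N (or n) characters.
GAP = re.compile(r"[Nn]+")

def split_scaffold(scaffold_seq):
    """Return (contigs, gap_lengths) for one scaffold sequence.

    contigs     -- the stretches of resolved sequence, in scaffold order
    gap_lengths -- the length of each run of N found in the scaffold
    """
    # Empty pieces appear if the scaffold starts or ends with N; drop them.
    contigs = [piece for piece in GAP.split(scaffold_seq) if piece]
    gap_lengths = [len(run) for run in GAP.findall(scaffold_seq)]
    return contigs, gap_lengths

# Toy scaffold: three contigs joined by two gaps of unknown sequence.
scaffold = "ACGTACGT" + "N" * 10 + "TTGACC" + "N" * 3 + "GGCATA"
contigs, gaps = split_scaffold(scaffold)
print(contigs)
print(gaps)
```

Running something like this over the records of a scaffolds file would show which scaffolds are single contigs (no N runs) and which are several contigs joined across gaps.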

9.5 years ago by apelin20 ▴ 490

0

Can you explain contig orientation?
Do you have an example, please?
Thanks

9.5 years ago by midox ▴ 290

1

Try this: http://genome.jgi.doe.gov/help/scaffolds.html

9.5 years ago by apelin20 ▴ 490

0

Thanks.

How do we know the distance between the two paired reads?

updated 5.6 years ago by Ram 45k • written 9.5 years ago by midox ▴ 290

1

SPAdes uses a k-bimer approach to estimate distances.

https://en.wikipedia.org/wiki/SPAdes_(software)

There is a range of papers describing various parts of the SPAdes algorithm (PMID: 22506599, PMID: 24931996, PMID: 26040456).

updated 5.6 years ago by Ram 45k • written 9.5 years ago by trausch ★ 2.0k

0

You can map your paired reads to a reference, such as an assembled genome, and create a BAM file. Then you can use CollectInsertSizeMetrics from Picard tools to estimate the distance between the two paired ends.
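As a rough illustration of what such an estimate involves, the sketch below computes the median and mean insert size from SAM-format text lines (for example, the output of `samtools view -h` on the BAM file). It relies only on the SAM columns: field 2 is the FLAG, where bit 0x2 marks a properly paired read, and field 9 is TLEN, the observed template length. The function name `insert_size_stats` and the toy records are hypothetical, and this is a simplified stand-in, not a reimplementation of the Picard tool.

```python
import statistics

PROPER_PAIR = 0x2  # SAM FLAG bit: read is mapped in a proper pair

def insert_size_stats(sam_lines):
    """Return (median, mean) of template lengths from SAM text lines."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])         # column 2: FLAG
        tlen = int(fields[8])         # column 9: TLEN (signed)
        # Each pair appears twice, once with +TLEN and once with -TLEN;
        # keeping only the positive value counts every pair exactly once.
        if flag & PROPER_PAIR and tlen > 0:
            sizes.append(tlen)
    if not sizes:
        raise ValueError("no properly paired reads found")
    return statistics.median(sizes), statistics.mean(sizes)

# Toy SAM records (made up): three read pairs on a reference called chr1.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t50M\t=\t350\t300\t*\t*",
    "r1\t147\tchr1\t350\t60\t50M\t=\t100\t-300\t*\t*",
    "r2\t99\tchr1\t500\t60\t50M\t=\t730\t280\t*\t*",
    "r2\t147\tchr1\t730\t60\t50M\t=\t500\t-280\t*\t*",
    "r3\t99\tchr1\t900\t60\t50M\t=\t1170\t320\t*\t*",
    "r3\t147\tchr1\t1170\t60\t50M\t=\t900\t-320\t*\t*",
]
median_size, mean_size = insert_size_stats(toy_sam)
print(median_size, mean_size)
```

On real data the median is usually the more robust summary, since a few chimeric or mis-mapped pairs can have very large template lengths.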

9.5 years ago by apelin20 ▴ 490