Question

Cap3 Integration Of Velvet And Newbler Assemblies

4

12.0 years ago

tommivat ▴ 250

I am doing a de novo assembly of a ~33 Mb genome using 454 and Illumina reads. I cannot use MIRA, since I have ~80M Illumina reads (which would require ~160 GB of memory). So far I have found that it is usually most efficient to first assemble the 454 reads with Newbler and the Illumina reads with Velvet, and then combine the two sets of contigs using a third assembly program. I have been using CAP3 for that last step, but I'm not satisfied with the results.

Statistics for the intermediate and final assemblies are shown below. The problem is that the CAP3 results are worse than the intermediate ones: the CAP3 contigs total only ~5.2 Mb, far less than either input assembly, so it seems that CAP3 throws most of the contigs away. Two questions:

Should I use some specific options for CAP3 when conducting the final assembly?
Is there a ready-made pipeline for doing this kind of 'integration' more effectively?

Statistics for the CAP3 output:

Number of contigs           826
Total size of contigs   5220088
Longest contig            37928
Mean contig size           6320
Median contig size         3734
N50 contig length         12593
L50 contig count            130

Statistics for Newbler output:

Number of contigs          1942
Total size of contigs  32110351
Longest contig           170575
Mean contig size          16535
Median contig size         8447
N50 contig length         37018
L50 contig count            272

Statistics for Velvet output:

Number of contigs          4939
Total size of contigs  34602711
Longest contig           134827
Mean contig size           7006
Median contig size         3463
N50 contig length         15446
L50 contig count            662
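
For readers who want to reproduce the N50 and L50 rows on their own contig files, here is a minimal Python sketch. It assumes the common definition (N50 is the length of the contig at which the running total of lengths, sorted longest first, reaches half of the assembly size; L50 is how many contigs that takes). The function names `contig_lengths` and `n50_l50` are made up for illustration, and the tool that produced the tables above may handle edge cases differently.

```python
def contig_lengths(fasta_lines):
    """Return the length of each record in an iterable of FASTA lines."""
    lengths = []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A header line starts a new contig.
            lengths.append(0)
        elif line and lengths:
            # Sequence line: add its length to the current contig.
            lengths[-1] += len(line)
    return lengths


def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths (assumed definition)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contigs given")
```

Usage would look like `n50_l50(contig_lengths(open("contigs.fasta")))`, where `contigs.fasta` is a placeholder file name.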

velvet assembly • 4.5k views

updated 12.0 years ago by lexnederbragt ★ 1.3k • written 12.0 years ago by tommivat ▴ 250

score 1 · Answer 1 · 2012-11-21

12.0 years ago

avik ▴ 60

Try Minimus2, although it may not improve the assembly statistics drastically.
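
A rough sketch of what a Minimus2 merge of two contig sets could look like, written as a Python helper that only builds the shell commands. The command sequence (concatenate the two FASTA files, convert with `toAmos -s ... -o ...`, then run `minimus2 <prefix> -D REFCOUNT=<n>`, with REFCOUNT set to the number of sequences in the first set) is how I recall the AMOS documentation describing it, so check it against your installed version. The file names, the `merged` prefix, and both function names are placeholders.

```python
import shlex


def count_fasta_records(fasta_lines):
    """Count header lines ('>') in an iterable of FASTA lines."""
    return sum(1 for line in fasta_lines if line.startswith(">"))


def minimus2_commands(first_fasta, second_fasta, refcount, prefix="merged"):
    """Build the shell commands for a Minimus2 merge (assumed usage).

    refcount is the number of sequences in first_fasta, which must come
    first in the concatenated file.
    """
    first = shlex.quote(first_fasta)
    second = shlex.quote(second_fasta)
    name = shlex.quote(prefix)
    return [
        f"cat {first} {second} > {name}.seq",
        f"toAmos -s {name}.seq -o {name}.afg",
        f"minimus2 {name} -D REFCOUNT={refcount}",
    ]
```

For the assemblies in the question, something like `minimus2_commands("newbler.fasta", "velvet.fasta", 1942)` would print the three commands to run, using the Newbler contig count as REFCOUNT.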

score 1 · Answer 2 · 2012-11-21

12.0 years ago

SES 8.6k

In addition to Minimus2, you may want to try Zorro, which is based on the same pipeline and uses MUMmer. I think CAP3 was designed for EST assembly and I have doubts about what it is doing with genomic contig assembly.

score 1 · Answer 3 · 2012-11-23

Newbler, the assembler from 454, can take both 454 reads and Illumina reads as input. Have you tried that? See http://contig.wordpress.com/2011/01/21/newbler-input-ii-sequencing-reads-from-other-platforms/ (and maybe http://contig.wordpress.com/2011/09/01/newbler-input-iii-a-quick-fix-for-the-new-illumina-fastq-header/)