Question

Open Source Sequence Assembly Programs

4


13.9 years ago

Monzoor ▴ 300

Can anyone suggest a simple, lightweight, open-source assembly tool that I can run on an ordinary desktop to assemble at most 20,000 to 30,000 sequences?

Note: I simulated these sequences myself from 10 genomes (the idea was to create an artificial metagenomics data set). Consequently, I do not have quality files for these sequences. Please do not suggest the FAMES data sets, as I have already benchmarked my experiment against them.
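For readers who want to reproduce a comparable input, here is a minimal sketch of one way such an artificial data set could be generated. It is not the method used for the data described above: the function names, the read-naming scheme, and the uniform choice of source genome are all assumptions made for illustration, and no sequencing errors are modelled.

```python
import random


def simulate_reads(genomes, n_reads, read_len, seed=0):
    """Sample error-free fragments of fixed length from a dict of genomes.

    genomes maps a genome name to its sequence string. Each read picks a
    source genome uniformly at random (an assumption: real communities
    have uneven abundances), then a uniform start position within it.
    """
    rng = random.Random(seed)  # seeded so the data set is reproducible
    names = sorted(genomes)
    for name in names:
        if len(genomes[name]) < read_len:
            raise ValueError(f"genome {name!r} is shorter than read_len")
    reads = []
    for i in range(n_reads):
        name = rng.choice(names)
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_len + 1)
        # Hypothetical header layout: read index, source genome, start offset.
        reads.append((f"read{i}|{name}|{start}", seq[start:start + read_len]))
    return reads


def write_fasta(records, path):
    """Write (identifier, sequence) pairs as a FASTA file, one line per sequence."""
    with open(path, "w") as fh:
        for read_id, seq in records:
            fh.write(f">{read_id}\n{seq}\n")
```

Keeping the source genome and offset in each header makes it easy to check afterwards which contigs mix reads from different genomes.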

open-source-software metagenomics assembly • 8.6k views

Updated 2.3 years ago by Ram 44k • written 13.9 years ago by Monzoor ▴ 300

score 5 · Answer 1 · 2010-12-20

5


13.9 years ago

Spitshine ▴ 660

CAP3 and PCAP (http://seq.cs.iastate.edu/) by Xiaoqiu Huang (http://www.cs.iastate.edu/~xqhuang/) are easy to use. We've been using CAP3 for assembly of simulated data for teaching.
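As a rough illustration of how little is needed to run CAP3 on a FASTA file without qualities, here is a small Python wrapper. It assumes a `cap3` executable is on your PATH and that output files are named by appending suffixes such as `.cap.contigs` and `.cap.singlets` to the input file name, which is how I recall CAP3 behaving; check this against your installed version. The function names are made up for this sketch.

```python
import shutil
import subprocess


def cap3_command(fasta_path):
    """Build the basic CAP3 invocation: the FASTA file is the only argument."""
    return ["cap3", fasta_path]


def cap3_outputs(fasta_path):
    """Expected output paths (assumed naming: suffixes appended to the input name)."""
    return {
        "contigs": fasta_path + ".cap.contigs",
        "singlets": fasta_path + ".cap.singlets",
    }


def run_cap3(fasta_path, log_path):
    """Run CAP3, saving its standard output (the alignment report) to log_path."""
    if shutil.which("cap3") is None:
        raise FileNotFoundError("cap3 executable not found on PATH")
    with open(log_path, "w") as log:
        subprocess.run(cap3_command(fasta_path), stdout=log, check=True)
    return cap3_outputs(fasta_path)
```

Calling `run_cap3("reads.fasta", "reads.cap3.log")` would then return the paths where the contigs and the unassembled singlets are expected.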


score 4 · Answer 2 · 2010-12-21

Shameless plug: MIRA3 at SourceForge, with the corresponding Wiki where you can find the full manual online. It is one self-contained binary, ready to run if you download the binary package.

Input formats can be any of the usual suspects (FASTA, FASTQ, PHD, EXP); supplying ancillary data is also possible (NCBI TRACEINFO XML, EXP, tab-delimited files).

You might be interested in the feature that lets you assign strain data to your reads and have the assembler mark SNPs and other differences when assembling all reads together. This works for both mapping and de-novo assemblies. See some examples here.

You can also tell the assembler which sequencing technology your (in this case simulated) sequences come from and see how it influences assembly and SNP calling (e.g., indels will not be called SNPs in 454 sequences by default, but in Sanger and Illumina sequences they will).

One last note: as you have no qualities, you will need to tell the assembler about it. See here for information on how to do that.
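Separately from the assembler setting mentioned above, a generic workaround for tools that insist on a quality file is to write a placeholder one. The sketch below is not a MIRA feature: it simply emits a FASTA-style `.qual` file giving every base the same score. The function names and the default score of 20 are arbitrary assumptions, and constant scores carry no real information, so results should be interpreted accordingly.

```python
def fasta_lengths(fasta_path):
    """Yield (header, sequence_length) for each record in a FASTA file."""
    header, length = None, 0
    with open(fasta_path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, length
                header, length = line[1:], 0
            elif header is not None:
                length += len(line)
    if header is not None:
        yield header, length


def write_constant_qual(fasta_path, qual_path, quality=20, per_line=20):
    """Write a .qual file with one identical score per base; return record count."""
    n_records = 0
    with open(qual_path, "w") as out:
        for header, length in fasta_lengths(fasta_path):
            out.write(f">{header}\n")
            values = [str(quality)] * length
            # Wrap the scores so that no line grows excessively long.
            for i in range(0, length, per_line):
                out.write(" ".join(values[i:i + per_line]) + "\n")
            n_records += 1
    return n_records
```

For example, `write_constant_qual("reads.fasta", "reads.fasta.qual")` writes one score line block per read, using the same headers as the FASTA file.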

score 3 · Answer 3 · 2010-12-20

3


13.9 years ago

Stephan ▴ 150

I know an online tool; it can take a couple of hours. I don't know your file format, but have a look: http://egassembler.hgc.jp/

http://www.phrap.org/phredphrapconsed.html

http://sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS


score 2 · Answer 4 · 2010-12-20

2


13.9 years ago

Rm 8.3k

IDBA de-novo assembly


0


Thanks Raghu. The tool looks interesting. I will try it out and let you know.

13.9 years ago by Monzoor ▴ 300