Question

Has Anyone Successfully Used ALLPATHS-LG On A >500Mb Genome With The Minimal Short+Kb Library Sets?

3

13.6 years ago

2184687-1231-83- ★ 5.1k

I was wondering whether anyone other than David Jaffe himself has stats on using ALLPATHS-LG on a relatively big genome (>500 Mb) with at least the minimal libraries recommended by the Broad: the short-insert one and the 3 kb or 5 kb one, at >40x coverage.

Is there any published data on that?

assembly illumina • 5.0k views

updated 10.3 years ago by Biostar 20 • written 13.6 years ago by 2184687-1231-83- ★ 5.1k

0

Recommended by whom? I think the required jump sizes would vary enormously depending on organism.

13.6 years ago by Ketil 4.1k

score 6 · Answer 1 · 2011-07-07

The Broad talks about some >500 Mb genomes it has assembled on its ALLPATHS-LG blog. I personally haven't had luck with it yet because not all of the modules respect the MAX_MEM_GB (or something similar) argument. I have a 1 TB system to work on, but it is shared, so the run crashes at one of the modules that checks how much memory is available and tries to use all of it. If you are on your own dedicated system with enough RAM, you should be OK. They say they are fixing that particular issue now so that people on shared systems can use the program.

That said, the guys at the Broad have been very quick to answer my questions and offer help when I got stuck going through their documentation. Their support has been a very pleasant experience, and they are eager to help get you going, which I can't exactly say for BGI. Also, in the genome assembly competition, which used a ~100 Mb simulated genome, their assembler was one of the best, and they didn't even have a large team of people working on assembly QC and post-processing like BGI did.

EDIT:

I used ALLPATHS-LG to generate the preliminary assemblies in the Crocodilian genome announcement paper last year: http://genomebiology.com/content/13/1/415. Using a combination of an overlap library and a reasonable-coverage 2 kb insert library, made with a custom protocol that Nader Pourmand at UCSC developed, we were able to achieve a scaffold N50 of 106 kb and a contig N50 of 28 kb. I have not been involved with the project recently, so I am not sure what the current state of the assemblies is, but ALLPATHS-LG did do a good job on that genome with the data we had available at the time.
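
For anyone comparing their own numbers against the N50 values above, here is a minimal sketch of how N50 is usually computed from a list of contig or scaffold lengths. The function name `n50` and the toy lengths are made up for illustration; this is not taken from ALLPATHS-LG's own reporting code, which may handle gaps or minimum-length cutoffs differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with hypothetical contig lengths (not real data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Note that contig N50 and scaffold N50 use the same calculation; only the input list differs (contig lengths versus scaffold lengths).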