Are there any de novo genome assemblers that work with both Nanopore and Illumina reads?
SPAdes can take both Nanopore and Illumina reads, but it's only for prokaryotic genomes. I haven't seen anything for eukaryotic.
All the discussion and literature that I have seen so far suggests using Nanopore long reads for assembly and then polishing with Illumina short reads. However, you need a certain level of coverage for the assembly to complete (for example, Canu's recommended minimum is 20X). What if you only have 1X coverage with long reads? That will not be enough to assemble on its own, but should be much better than short reads alone. What's the appropriate approach for that situation?
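To make the coverage question concrete, here is a minimal sketch of the arithmetic: average coverage is total sequenced bases divided by genome size. The function names, the 500 Mb genome size, and the read yields are illustrative assumptions; only the 20X threshold comes from the post above.

```python
def coverage(total_bases: int, genome_size: int) -> float:
    """Average depth of coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def enough_for_long_read_assembly(total_bases: int, genome_size: int,
                                  min_coverage: float = 20.0) -> bool:
    """True if the long-read coverage reaches the assumed minimum (20X here)."""
    return coverage(total_bases, genome_size) >= min_coverage


# Hypothetical numbers: a 500 Mb genome and 0.5 Gb of Nanopore reads.
GENOME_SIZE = 500_000_000
NANOPORE_BASES = 500_000_000

print(f"Long-read coverage: {coverage(NANOPORE_BASES, GENOME_SIZE):.1f}X")
print("Enough for a long-read-only assembly:",
      enough_for_long_read_assembly(NANOPORE_BASES, GENOME_SIZE))
```

With these assumed numbers the long reads fall far short of the threshold, which is the situation described in the question.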
How large is your genome expected to be?
You could give SPAdes a try. As long as you are not in the "human" genome territory it may work. I recall one of the SPAdes developers writing that it could be used for larger genomes (e.g. fungal), but I can't find that post/thread at the moment.
Edit: The SPAdes manual recommends not using the --careful option for "large or medium" eukaryotic genomes. So it looks like you could certainly try it out.
It should be around 500 Mb, so it's not too big, but certainly closer to human than bacterial size.
Good point about the --careful option, but the manual also says "SPAdes is not intended for larger genomes (e.g. mammalian size genomes)", so I am not sure which part to believe.
If you have some time (and I think you have the resources, if I recall from a 10x thread) go ahead and give it a try. At the most the job will fail :)
Good memory!
I certainly plan to give it a try. I just wanted to know if I am missing anything and to have some alternatives in case it fails.
Can it be run within 256 GB of RAM?
I think Trinity is the best option for Nanopore reads in hybrid assembly.
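For anyone who wants to try the hybrid run discussed above, here is a rough sketch of how the SPAdes command line could be put together. The flags used (-1, -2, --nanopore, -o, -t, -m, --careful) are documented SPAdes options; the helper function, file names, and thread count are hypothetical. Note that -m only sets a memory limit in GB. It does not tell you whether a 500 Mb genome will actually fit.

```python
import shlex


def spades_hybrid_command(r1, r2, nanopore, outdir,
                          threads=16, memory_gb=250, careful=False):
    """Build (but do not run) a SPAdes hybrid assembly command.

    Hypothetical helper: paired Illumina reads go to -1/-2 and the
    Nanopore reads to --nanopore. --careful is left off by default,
    following the manual's advice for larger eukaryotic genomes.
    """
    cmd = [
        "spades.py",
        "-1", r1,
        "-2", r2,
        "--nanopore", nanopore,
        "-o", outdir,
        "-t", str(threads),
        "-m", str(memory_gb),  # memory limit in GB
    ]
    if careful:
        cmd.append("--careful")
    return cmd


# Placeholder file names; replace with your own data.
cmd = spades_hybrid_command(
    "illumina_R1.fastq.gz",
    "illumina_R2.fastq.gz",
    "nanopore.fastq.gz",
    "spades_hybrid_out",
    memory_gb=256,
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell or handed to subprocess.run(cmd, check=True) once the paths point at real files.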
Do you have a source for that? Because on GitHub I find the following:
I should've specified it's genome assembly, not transcriptome. Trinity is for RNA-seq.
Oh, sorry, that's true, it is for RNA-seq. What about IDBA_hybrid? You can use the Nanopore reads as a reference.
Hi there, I'm new to the subject but I will soon be facing the same questions. So far I have only found SPAdes and ALLPATHS-LG that do that.
With better coverage, what would be the best approach? Using a pipeline to assemble de novo with Nanopore and Illumina data, or assembling the genome with Nanopore data and then correcting it with Illumina data? Or even completing the draft genome from Illumina with Nanopore data?
Thank you very much,
Nanopore is still lacking in performance, and its cost/performance ratio remains high. I think PacBio is the best option for long reads and for completing fragmented assemblies (from Illumina). Where are you from, lagartija? I know your name :).
Actually I already have the Nanopore reads, so I can't change that. By the way, do you know what the difference is between SPAdes and SPAdes hybrid? It seems that both can do hybrid assembly...
So you know my name? Do you mean lagartija or my real name? Haha, I'm from France. But I'm also Argentinian and Norwegian. And you? Italian?
No, they are not the same. With SPAdes you can use 'trusted contigs' for de novo assemblies, but not reads. SPAdes hybrid, on the other hand, can perform de novo assemblies from long and short reads :). I'm from the Congo, but I lived in America years ago; I know lagartijas XD.
Aaah, I see. And how do I get the trusted contigs? And for both Illumina and Nanopore?
You can use old assemblies as trusted contigs (from the same species or a closely related one); using genomes that are not closely related is not recommended (in SPAdes). If you don't have access to old assemblies (or none exist), de novo hybrid assembly is the only option. And yes, you can use Nanopore and Illumina reads for hybrid assemblies with SPAdes hybrid.
Only if you have them from some other source (e.g. an Illumina-only assembly).
Because from what I see here, SPAdes takes reads: http://spades.bioinf.spbau.ru/release3.10.1/manual.html
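To illustrate the trusted-contigs discussion above, here is a small sketch of how an existing assembly could be passed to SPAdes alongside the Illumina reads. --trusted-contigs and --untrusted-contigs are options listed in the SPAdes manual; the helper function and file names are hypothetical, and whether a given older assembly is good enough to count as "trusted" is a judgment call this code does not make.

```python
import shlex


def spades_with_contigs_command(r1, r2, contigs, outdir, trusted=True):
    """Build (but do not run) a SPAdes command that adds existing contigs.

    Hypothetical helper: high-quality contigs (e.g. from an earlier
    assembly of the same species) go in as --trusted-contigs; anything
    of lower or unknown quality goes in as --untrusted-contigs.
    """
    flag = "--trusted-contigs" if trusted else "--untrusted-contigs"
    return ["spades.py", "-1", r1, "-2", r2, flag, contigs, "-o", outdir]


# Placeholder file names; replace with your own data.
cmd = spades_with_contigs_command(
    "illumina_R1.fastq.gz",
    "illumina_R2.fastq.gz",
    "previous_assembly.fasta",
    "spades_contigs_out",
)
print(shlex.join(cmd))
```

The reads are still required in this mode; the contigs are supplied in addition to them, not instead of them.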