I am trying to make a hybrid assembly using PacBio reads and Nanopore reads. I tried Canu, but I could not find much information about the pipeline to follow. Could you suggest another approach? Also, if you have information about the whole Canu pipeline for hybrid assembly, it would help me a lot!
Why have you added the short-reads tag? Neither technology fits that.
Here are the Canu docs on how to use multiple files. Note that if you have PacBio HiFi reads, the docs say that mixing these with other read types is not supported yet.
To my knowledge, most hybrid assemblers (e.g., SPAdes and MaSuRCA) assume a mix of long- and short-read data, so I'm unsure what you'll be able to gain from combining two long-read datasets.
I'm curious about your approach though. Why are you mixing two long-read technologies?
Yes, it could be quite confusing, but my Nanopore data is actually short reads: it does not come from a genomic DNA extraction but from genome-wide libraries. I ran the command you sent from "Assembling With Multiple Technologies" in the Canu docs, but I was confused by the output because in the end I did not get any assembly file.
If you have sufficient coverage of PacBio HiFi reads (I think 30-50X), then just use hifiasm: https://github.com/chhylp123/hifiasm. It is a generally accepted assembler for HiFi.
I don't know how you can use the Nanopore reads. Normally the advantage of ONT is that the reads are very long, but that does not seem to be true in your case. We can get excellent Nanopore + Illumina assemblies with Q20+ (R10.4.1) Nanopore data.
Illumina is normally used after ONT for polishing base-level errors: mismatches and small indels.
I would not use Canu as it is quite old and slow now. If you don't like hifiasm you can try Flye or Shasta, but if you want more info you'll have to tell us your genome size.
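To check whether you are in that 30-50X range, coverage is simply total sequenced bases divided by genome size. Here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and it assumes a standard 4-line-per-record FASTQ (no wrapped sequence lines), plain or gzip-compressed:

```python
import gzip
from pathlib import Path


def total_bases(fastq_path):
    """Sum the read lengths in a FASTQ file (plain text or .gz).

    Assumes exactly four lines per record, with the sequence on line 2.
    """
    path = Path(fastq_path)
    opener = gzip.open if path.suffix == ".gz" else open
    total = 0
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                total += len(line.strip())
    return total


def estimated_coverage(n_bases, genome_size):
    """Coverage (X) = total sequenced bases / genome size in bases."""
    return n_bases / genome_size
```

For example, `estimated_coverage(total_bases("hifi_reads.fastq.gz"), genome_size)` with your genome size in bases gives the approximate HiFi depth; `hifi_reads.fastq.gz` is a placeholder file name.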
Hi
Thanks for your answer. I think I have very good coverage. My genome size is 48 Mb.
I also have Pore-C data and Illumina data, the latter from RNA-seq. I saw that hifiasm has an option for Pore-C. In your experience, could that be a good approach?
48 Mb is tiny, fungal size maybe. HiFi alone should be completely fine. You may not need Pore-C on top of that. Start simple and integrate further evidence like Pore-C as you go.
The Illumina RNA-seq data will be useful at the annotation stage but not for genome assembly.