Question

Eukaryotic genome assembly

4 months ago

manaswiniparija3 ▴ 50

I want to use the MaSuRCA genome assembler to assemble a fish genome, since I have both short and long reads. However, MaSuRCA's stated requirements for avian/small plant genomes (up to 1 Gb) are 256 GB RAM, 32+ cores, and 2 TB of disk space, which I do not have. I looked for free online platforms where I could run this kind of toolkit for hybrid assembly. I tried usegalaxy.org.au, but it provides only 100 GB of usable space and I have around 90 GB of raw data, so I am unable to work with it. Can someone suggest how I can use this toolkit, or an alternative?

Genome-assembly MaSuRCA • 703 views

ADD COMMENT • link updated 4 months ago by Ram 44k • written 4 months ago by manaswiniparija3 ▴ 50

I'd love to be proven wrong, but common sense tells me that there is no free server with 256 GB RAM, 32+ cores, and 2 TB of disk space. My suggestion is to look for paid servers if funding is available. If not, it may still be better to partner with someone at an educational institution who has access to a computer with these specs.

ADD REPLY • link 4 months ago by Mensur Dlakic ★ 28k

ok thanks

ADD REPLY • link 4 months ago by manaswiniparija3 ▴ 50

I would just use the long reads to generate an assembly on their own with a long-read assembler, then polish with the short reads if necessary. You can also try downsampling the raw data to reduce your computational requirements.
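As a rough illustration of the downsampling idea, here is a minimal Python sketch that keeps each read with a fixed probability. It assumes an uncompressed FASTQ file with exactly four lines per record; the function name `downsample_fastq` and its parameters are hypothetical, not part of any tool mentioned in this thread.

```python
import random
from itertools import islice


def downsample_fastq(in_path, out_path, fraction, seed=0):
    """Keep each 4-line FASTQ record with probability `fraction`.

    Assumes an uncompressed FASTQ file with exactly 4 lines per record.
    Returns a (kept, total) tuple of record counts.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the same subset can be reproduced
    kept = total = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        while True:
            record = list(islice(src, 4))  # header, sequence, '+', qualities
            if not record:
                break
            if len(record) < 4:
                raise ValueError("truncated FASTQ record at end of file")
            total += 1
            if rng.random() < fraction:
                dst.writelines(record)
                kept += 1
    return kept, total
```

For paired-end short reads, running this on both files with the same `seed` and `fraction` should keep the mates in sync, provided both files list the same reads in the same order. Dedicated tools such as seqtk offer the same kind of subsampling and handle compressed input.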

ADD REPLY • link 4 months ago by samuel.a.odonnell ▴ 580

score 0 · Answer 1 · 2024-07-08

4 months ago

colindaven 7.0k

Like Samuel says, you don't need a hybrid assembly. Run either Flye or Shasta. Both are easy to use, but go with Flye first, since Shasta will likely need >256 GB RAM for this dataset.
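For reference, a minimal sketch of how a basic Flye run could be put together from Python. The read-type flags are Flye's own, but the helper `flye_command`, the file names, the thread count, and the default choice of `--nano-raw` are all assumptions for illustration; pick the flag that matches your actual reads.

```python
import shlex

# Flye selects its error model from the read-type flag.
FLYE_READ_FLAGS = {
    "ont_raw": "--nano-raw",
    "ont_hq": "--nano-hq",
    "pacbio_raw": "--pacbio-raw",
    "pacbio_hifi": "--pacbio-hifi",
}


def flye_command(reads, out_dir, read_type="ont_raw", threads=8):
    """Build (but do not run) a basic Flye command line as a list of arguments."""
    try:
        flag = FLYE_READ_FLAGS[read_type]
    except KeyError:
        raise ValueError(f"unknown read_type: {read_type!r}") from None
    return ["flye", flag, reads, "--out-dir", out_dir, "--threads", str(threads)]


# Hypothetical file names; pass the list to subprocess.run(...) to execute it.
print(shlex.join(flye_command("reads.fastq", "flye_out")))
```

Building the command as a list keeps paths with spaces safe when it is later handed to `subprocess.run`.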

Then polish the long-read assembly downstream using the long reads, with a toolset specific to your read type (do you have PacBio or ONT?).

The Hypo tool is very good for Illumina polishing.

If you have ONT reads, you can take the route of dorado correct followed by the hifiasm assembler.

In terms of hard disk space, you might be able to buy a 2 or 4 TB internal SSD (or maybe even an external SSD), since you will definitely need it for this kind of assembly.

ADD COMMENT • link 4 months ago by colindaven 7.0k

Thanks, everyone, for the replies. I have Oxford Nanopore reads, but they have an N50 of 2500 bp and most of the reads have quality scores below 28. I assembled them with Flye and Raven, then used LINKS for scaffolding and Gap2Seq for gap filling with the BGseq reads, but I found only 2 complete BUSCOs in the resulting assemblies. That's why I thought of building super-reads from the BGseq (short) reads and performing a hybrid assembly.
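For anyone wanting to check a read N50 figure like the one above, here is a minimal Python sketch of the usual definition: the length L such that reads of length L or longer contain at least half of all bases. The helper names `read_lengths` and `n50` are hypothetical, and the reader assumes an uncompressed four-line-per-record FASTQ file.

```python
def read_lengths(fastq_path):
    """Yield the sequence length of each record in an uncompressed 4-line FASTQ."""
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # the second line of each record is the sequence
                yield len(line.strip())


def n50(lengths):
    """Return the N50 of a collection of lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:  # longest-first sum has reached half the bases
            return length


# Small worked example: total is 54, half is 27, and 10 + 9 + 8 reaches it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

With a real file, `n50(read_lengths("reads.fastq"))` would give the read N50, where `reads.fastq` is a placeholder path.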

ADD REPLY • link 4 months ago by manaswiniparija3 ▴ 50