How many resources does HiCanu need to assemble PacBio sequences?
19 months ago
Ananthu • 0

Hi, I am planning to assemble a diploid genome using PacBio reads. We have 900 GB of raw reads. The genome size is ~1 Gbp. What resources do we need to get this done? How much RAM and how many CPUs do we need? We have HPC nodes with >100 GB of RAM and 26 CPUs. Will this be sufficient? Thank you.

HiCanu Genome assembly • 1.1k views

You may consider reaching out to the developers behind the program: https://github.com/marbl/canu

19 months ago
Dave Carlson ★ 2.0k

Since you have access to an HPC cluster, this should not be a problem. You can run Canu with useGrid=true, and Canu will parallelize the analysis across multiple nodes using your cluster's job scheduler.

You can additionally supply flags to control how many threads and how much memory is used per process to ensure that the CPU and memory limits of your nodes are not exceeded. See more here:

https://canu.readthedocs.io/en/latest/tutorial.html#execution-configuration
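As a rough illustration of the approach described above, here is a minimal Python sketch that assembles a Canu command line capped to the node limits from the question (26 CPUs, about 100 GB of RAM). The `useGrid`, `maxThreads`, `maxMemory`, `gridOptions` and `genomeSize` options come from the Canu documentation; the helper name `build_canu_command`, the prefix, the output directory, the read file name and the scheduler time limit are all hypothetical placeholders. The `-pacbio-hifi` read type is an assumption based on the thread title mentioning HiCanu; raw (CLR) reads would need a different read-type option, so check the documentation for your Canu version.

```python
import shlex


def build_canu_command(prefix, out_dir, genome_size, reads,
                       max_threads, max_memory_gb, grid_options=None):
    """Return a Canu command (as a list of arguments) capped to per-node limits.

    This helper is hypothetical; it only builds the command and does not run it.
    """
    cmd = [
        "canu",
        "-p", prefix,                   # prefix for output file names
        "-d", out_dir,                  # assembly working directory
        f"genomeSize={genome_size}",    # estimated genome size, e.g. "1g"
        "useGrid=true",                 # let Canu submit jobs to the scheduler
        f"maxThreads={max_threads}",    # cap on threads per job
        f"maxMemory={max_memory_gb}",   # cap on memory per job (assumed GB)
    ]
    if grid_options:
        # Extra options passed through to the job scheduler (site-specific).
        cmd.append(f"gridOptions={grid_options}")
    # Assumption: HiFi reads, since the thread title mentions HiCanu.
    cmd += ["-pacbio-hifi", *reads]
    return cmd


# Placeholder values; adjust to your own data and cluster.
cmd = build_canu_command(
    prefix="asm",
    out_dir="asm-canu",
    genome_size="1g",
    reads=["reads.fastq.gz"],
    max_threads=26,
    max_memory_gb=100,
    grid_options="--time=72:00:00",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on the cluster's login node. Setting the caps at or slightly below the real node limits should keep the scheduler from rejecting jobs that request more than any single node can provide.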

Using this strategy, I've assembled a ~1 Gb, highly repetitive genome with Canu on a cluster with specs similar to the one you're describing.
