How to estimate the memory requirements for Diamond given the database and query protein sizes?
24 months ago
O.rka ▴ 740

On our new servers we have to request the amount of memory and time needed for each job. We are charged per thread and per unit of requested memory, for the time the job actually takes to complete (not the time requested). I'm trying to minimize costs for a larger job.
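To make the billing concrete, here is a minimal sketch of how I understand the charge. It assumes the cost is simply the product of threads, requested memory, and elapsed time; the multiplicative form, the `rate` parameter, and the function names are my own assumptions, not the cluster's documented formula.

```python
def job_cost(threads, memory_gb, hours, rate=1.0):
    """Cost of one job under an ASSUMED model: rate * threads * memory * time.

    `hours` is the elapsed run time, not the requested time.
    `rate` is a hypothetical price per thread-GB-hour.
    """
    return rate * threads * memory_gb * hours


def total_cost(jobs, rate=1.0):
    """Sum the cost of many jobs, each given as (threads, memory_gb, hours)."""
    return sum(job_cost(t, m, h, rate) for t, m, h in jobs)
```

With measured memory and run time from a small test job plugged in for each option, `total_cost` lets the two approaches be compared directly, for example `total_cost([(8, 64, 10)])` against `total_cost([(8, 16, 0.5)] * 100)`. Those numbers are placeholders, not measurements.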

I have a database that is 68GB and a query set of 48,170,345 protein sequences (11GB gzipped, ~19GB uncompressed).

I can do either of the following:

  1. Run Diamond against all of the proteins at once (I feel like this would be the most expensive)
  2. Split the proteins into 100 files and run each one separately (each file is about 189MB)
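As a quick sanity check on the split in option 2, the arithmetic can be sketched as follows. It assumes the 48170345 sequences (~19GB uncompressed) are what gets divided into the 100 files; the constant and function names are hypothetical, and the byte total is only approximate.

```python
import math

TOTAL_SEQUENCES = 48_170_345       # from the post
UNCOMPRESSED_BYTES = 19 * 10**9    # "~19GB uncompressed", approximate
N_CHUNKS = 100


def per_chunk(total, n_chunks):
    """Largest number of items any chunk holds when splitting evenly."""
    return math.ceil(total / n_chunks)


seqs_per_chunk = per_chunk(TOTAL_SEQUENCES, N_CHUNKS)
mb_per_chunk = UNCOMPRESSED_BYTES / N_CHUNKS / 10**6

print(seqs_per_chunk, mb_per_chunk)
```

This gives roughly 480 thousand sequences and about 190MB per file, consistent with the ~189MB observed.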

Which method would use fewer resources?

How can I estimate how many resources would be required per job?

alignment diamond memory requirements • 393 views
