Hello,
I am trying to run a blastp search of 23 sequences against a BLAST database containing approximately 15,000 sequences (in a for loop that will give me separate outputs). As a newcomer to HPC systems, I am unsure how many CPUs or how much memory I will need for this task. Can you advise me on how to set these parameters correctly? Is there a way to determine the optimal values, or will my intuition improve with experience?
If your HPC uses a job scheduler, submit the jobs through it. You don't say how large the database and query are, but assuming they are nothing like nt/nr, you may be able to get away with 8 cores and 30G of RAM. You may simply need to run a few jobs and try things out.
I don't know if there is a job scheduler; I'll look into it. I don't remember exactly, but my database was ~30 megabytes. Since it's a small one, 4 cores and 32 GB of RAM handled my job in less than 2 minutes. I gave the query sequences one by one in a for loop and got separate outputs. I haven't had the chance to try this with databases of different sizes, but I'll try different scenarios and learn from them. Thank you for your answer!
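For anyone wanting to repeat the "run a few jobs and try things out" approach, here is a rough sketch of the per-query loop with wall-clock timing, so that runs with different thread counts can be compared. It is only an illustration under assumptions: the one-FASTA-file-per-query layout, the `.fasta` extension, the tabular output format (`-outfmt 6`), and the function names `build_blastp_cmd` and `run_all` are all made up here and are not from the posts above. It assumes `blastp` is on the PATH and that the database was already built with `makeblastdb`.

```python
import subprocess
import time
from pathlib import Path


def build_blastp_cmd(query, db, out, threads=4):
    """Return the blastp argument list for one query file (hypothetical helper)."""
    return [
        "blastp",
        "-query", str(query),
        "-db", str(db),
        "-out", str(out),
        "-outfmt", "6",                 # tabular output; an assumption, pick what you need
        "-num_threads", str(threads),   # should not exceed the cores you requested
    ]


def run_all(query_dir, db, out_dir, threads=4):
    """Run blastp once per *.fasta file in query_dir; return seconds taken per query."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    timings = {}
    for query in sorted(Path(query_dir).glob("*.fasta")):
        out = out_dir / f"{query.stem}.tsv"   # one separate output per query
        cmd = build_blastp_cmd(query, db, out, threads)
        start = time.perf_counter()
        subprocess.run(cmd, check=True)       # raises CalledProcessError if blastp fails
        timings[query.name] = time.perf_counter() - start
    return timings
```

Calling something like `run_all("queries", "db/proteins", "results", threads=4)` (paths hypothetical) and then repeating with `threads=8` gives a first feel for whether extra cores help on a database this small. The sketch measures time only, not memory; peak memory would have to come from the scheduler's accounting or a system tool.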