Let's get our terminology right: A core is just a fancy name for one of several processors residing on the same chip. A process is a running instance of a program, and a thread is a unit of execution inside a process that shares (some) memory with that process's other threads. With hyper-threading, a single core can execute multiple threads simultaneously; the operating system then sees it as multiple logical processors. So multi-core and multi-processor are hardware features, whereas multi-threading is a software feature.
Btw, if you think your CPU has 8 cores, it most likely has just four plus hyper-threading.
I know that selecting the same number of cores that an application can support will get the application running at its maximum performance.
This is an oversimplification, but should be true in general. Good programs will detect the number of available (logical) processors and use that many threads.
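As a rough illustration of "detect the number of logical processors and use that many workers", here is a minimal Python sketch. The helper names default_worker_count and run_in_parallel are made up for this example and do not come from any tool mentioned in this thread. It uses a thread pool for simplicity; in CPython, CPU-bound work usually needs a process pool instead to actually occupy several cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_worker_count():
    """Return the number of logical processors, falling back to 1 if unknown."""
    return os.cpu_count() or 1


def run_in_parallel(func, items, workers=None):
    """Apply func to every item using a pool sized to the logical processors.

    Hypothetical helper: 'workers' defaults to the detected processor count.
    Results are returned in the same order as 'items'.
    """
    workers = workers or default_worker_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))
```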
Then how about the multi-processes supporting applications?
"multi-processes" is something very different from "multi-threading" and is much harder to pull off. Supporting "multi-processors" is done by "multi-threading". It's confusing, I know. ☺
Specifically, if an application supports multi-processes (from the documentation there is no information about whether or not it supports multi-threads, and no information about the upper limit of multi-processes), and the machine has 64GB memory and 16 cores (…), then how to set the option -p processes to get the application running at its maximum performance with available resources?
My guess: the documentation does not use the terms I defined above. The simplest thing to do here is to open up your system monitor. Start the program with prog -p 16 and see if all your CPUs are being used (after it has finished reading in files). If your tool simply splits into 16 threads, it should show up as one process using 100% of your CPU. If you can see the number of threads, that should be 16.
If somehow prog -p 16 does not give you the desired result, you should strive for the smallest number of threads that uses 100% of your CPU.
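If watching the system monitor is inconvenient, one alternative is to time a few runs with different -p values and keep the smallest value that is about as fast as the best one. Note that this swaps in wall-clock time as a proxy for "uses 100% of your CPU". The sketch below is only an assumption about how one might do that: prog stands for your application, and time_command, smallest_fast_setting and the 5% tolerance are hypothetical choices.

```python
import subprocess
import time


def time_command(cmd):
    """Run a command (list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def smallest_fast_setting(timings, tolerance=0.05):
    """Given {n_processes: seconds}, return the smallest n whose time is
    within 'tolerance' (as a fraction) of the fastest observed time."""
    best = min(timings.values())
    return min(n for n, t in timings.items() if t <= best * (1 + tolerance))


# Hypothetical usage, with 'prog' standing in for the real application:
# timings = {n: time_command(["prog", "-p", str(n)]) for n in (1, 2, 4, 8, 16)}
# print(smallest_fast_setting(timings))
```

Run each setting on the same input and, if possible, more than once, since disk caching can make the first run slower.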
I'm sorry, I don't mean to be rude, truly, but unless you know if your application supports multicore processors and shared memory, this question seems unanswerable.
Thanks Alex. From the application's documentation, there is no multi-threads option. I'd better check with the developer. An application supporting multi-processes does not necessarily support multi-cores, does it?
And for the multi-processes option, will the application be quicker with -p 3 than with -p 1 (with the same other arguments)?
Thank you.
If whatever analysis you are doing allows for it, you could always brute-force parallelize, e.g. by splitting your input fastq files into multiple pieces and starting multiple single-core jobs. Result files can then be combined into one (e.g. with alignments).
Thanks for all your replies. I understand the concepts of thread, processor and process now.
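The brute-force splitting suggested above could look roughly like this in Python. This is a sketch under stated assumptions: the input is plain 4-line-per-record FASTQ (no wrapped sequence lines) that has already been read into a list of lines, and split_fastq is a hypothetical helper name. Each returned chunk would then be written to its own file and handed to a separate single-core job.

```python
def split_fastq(lines, n_chunks):
    """Split FASTQ lines into at most n_chunks pieces without breaking records.

    Assumes the simple layout of exactly 4 lines per record.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    n_records = len(lines) // 4
    # Ceiling division so that every record lands in some chunk.
    records_per_chunk = max(1, -(-n_records // n_chunks))
    step = records_per_chunk * 4
    return [lines[i:i + step] for i in range(0, len(lines), step)]
```

Whether the per-chunk results can simply be concatenated afterwards depends on the analysis; that part is not covered here.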
The application I am using works on a BAM file, doing a series of analyses at each site. These analyses are performed by 3rd-party tools (samtools, bedtools ....) plus functions written by the developer. I guess the -p processes option (split into multiple processes) of the application is different from what we are talking about. Does it mean that different 3rd-party tools can be run on different sites simultaneously? Specifically, samtools works on one site on one processor while bedtools works on another site on another processor. If this is the case, I cannot see the advantage over just one process. One process here, I guess, means that the same tool is working on different sites on different processors at the same time.
Sorry if the above seems confused.
Another question: why can the CPU usage for each copy of the application be above 50%? (I ran two copies of the application.) The sum of CPU usage is above 100%.
PID    USER  %CPU  %MEM  COMMAND
14004  root  95.3   3.6  python
21521  root  94.7   3.6  python
Usage of each core is reported independently, so multiple cores can add up to several hundred percent (e.g. 8 cores would be 800% when fully used).
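To make the arithmetic concrete, here is a small Python sketch using the two %CPU values posted above and the 16 cores mentioned earlier in the thread. The function names are hypothetical and only illustrate the per-core convention, where 100% means one fully busy core.

```python
def cores_in_use(cpu_percentages):
    """Convert per-process %CPU values (100% = one full core) into a core count."""
    return sum(cpu_percentages) / 100.0


def fraction_of_machine(cpu_percentages, n_logical_cpus):
    """Fraction of the whole machine in use, between 0 and 1."""
    return sum(cpu_percentages) / (100.0 * n_logical_cpus)


# The two python processes from the listing above, on a 16-core machine:
usage = [95.3, 94.7]
print(cores_in_use(usage))             # roughly 1.9 cores busy
print(fraction_of_machine(usage, 16))  # roughly 12% of the machine
```

So two processes near 95% each add up to about 190%, which is nearly two busy cores out of a possible 1600% on this machine.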
So does it mean that both commands each use 4 CPUs or what?
Yes. That is correct.
jing.mengrabbit : Please use ADD REPLY/ADD COMMENT to respond to existing posts to keep the threads logically organized.
Thanks for your advice. I will follow it.