I used TopHat (v. 2.1.1) to align zebrafish RNA-Seq reads to a genome (Danio_rerio.GRCz10.pep.abinitio.fa) on the system described below.
System specifications:
Memory: 4GB RAM
Processor: Intel® Core™ i5-4590 CPU @ 3.30GHz × 4
OS: Ubuntu 16.04 LTS (64-bit)
Hardware: 64-bit architecture
TopHat Command:
time tophat --solexa-quals -g 2 -p 1 --no-coverage-search -j annotation/Danio_rerio.Zv9.66.spliceSites -o tophatoutput/ZV9 genome/zebrafish data/SRR630464_1.fastq data/SRR630464_2.fastq
TopHat Command Parameters
TopHat command parameters that we used are listed below:
-g Maximum number of multi-hits allowed. Short reads are likely to map to more than one location in the genome even though each read can have originated from only one of these regions. In RNA-Seq we allow a restricted number of multi-hits, and in this case we ask TopHat to report only reads that map to at most 2 different loci.
-p Use this many threads to align reads.
--library-type Before performing any type of RNA-Seq analysis you need to know a few things about the library preparation. Was it done using a strand-specific protocol or not? If yes, which strand? In our data the protocol was NOT strand-specific. (This option is not set in the command above, so TopHat's default applies.)
--no-coverage-search Disables the coverage-based search for junctions, which reduces run time and memory use.
-j Improve spliced alignment by providing TopHat with annotated splice junctions. Pre-existing genome annotation is an advantage when analyzing RNA-Seq data. This file contains the coordinates of annotated splice junctions from Ensembl. These are stored under the sub-directory annotation in a file called ZV9.spliceSites.
-o Specifies the subdirectory in which TopHat should save the output files.
My question: increasing the number of threads (multi-threading) should reduce the time it takes to align the reads to the genome. However, as shown in the table below, the alignment time goes up rather than down as I add threads. The time decreases only with two threads, while with 4, 8 and 16 threads it increases. Why?
Table
Files: FASTQ files of zebrafish, accession number GSE42846 (SRR630464_1.fastq / SRR630464_2.fastq)
Size: 4.2 GB per file
Nucleotide sequences: 24410561 sequences per file
1 thread: user time 137 min 0 sec, system time 5 min 12 sec, total time 142 min 12 sec
2 threads: user time 120 min 33 sec, system time 6 min 40 sec, total time 127 min 13 sec
4 threads: user time 124 min 20 sec, system time 19 min 54 sec, total time 144 min 14 sec
8 threads: user time 122 min 41 sec, system time 18 min 31 sec, total time 141 min 12 sec
16 threads: user time 122 min 22 sec, system time 21 min 27 sec, total time 143 min 49 sec
Total time = User time + System Time
We used 1,2,4,8 and 16 threads to align reads.
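To make the comparison easier to read, here is a small Python sketch that recomputes the totals from the table and expresses each run relative to the 1-thread run. The numbers are copied from the table; the helper names are only for illustration. Note that the ratio is based on summed CPU time (user + system), not on wall-clock time.

```python
# Timings copied from the table above: threads -> (user, system), each as (min, sec).
timings = {
    1: ((137, 0), (5, 12)),
    2: ((120, 33), (6, 40)),
    4: ((124, 20), (19, 54)),
    8: ((122, 41), (18, 31)),
    16: ((122, 22), (21, 27)),
}


def to_seconds(minutes, seconds):
    """Convert a (min, sec) pair to seconds."""
    return minutes * 60 + seconds


def cpu_seconds(threads):
    """Total CPU time = user time + system time, in seconds."""
    user, system = timings[threads]
    return to_seconds(*user) + to_seconds(*system)


baseline = cpu_seconds(1)
for threads in sorted(timings):
    total = cpu_seconds(threads)
    # Ratio > 1 means less CPU time than the 1-thread run.
    print(f"{threads:>2} threads: {total // 60} min {total % 60:02d} sec, "
          f"{baseline / total:.2f}x relative to 1 thread")
```

Only the 2-thread run differs noticeably from the baseline; the other runs are within about 2% of it.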
Kindly help me out to find a valid reason.
Thanks in advance.
After the actual alignment using Bowtie2 (which does scale well with multiple threads), TopHat spends a huge amount of time in a single-threaded phase that may dominate the overall time, depending on the data. The alternative tools WouterDeCoster pointed out generally don't do this.
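The effect of a dominant single-threaded phase can be sketched with Amdahl's law. This is only an illustration: the parallel fraction of 0.2 below is a made-up value, not something measured for TopHat or for this dataset.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Ideal speedup when only `parallel_fraction` of the run time scales with threads."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Hypothetical assumption: only 20% of the run (the Bowtie2 step) is multi-threaded.
PARALLEL_FRACTION = 0.2

for threads in (1, 2, 4, 8, 16):
    print(threads, round(amdahl_speedup(PARALLEL_FRACTION, threads), 3))
```

Under that assumption the speedup can never exceed 1 / 0.8 = 1.25, no matter how many threads are used, and most of that gain is already reached with a few threads.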
Thank you for your reply. Would you kindly explain singlethreaded phase?
So you have a quad core processor, do I understand that correctly? If you use more threads than physical processors I can understand it's not that efficient.
In addition, you should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR and bbmap, or pseudo-alignment using kallisto or salmon.
Thank you for your reply and time.
Yes sir, I have a quad-core system. It shows the best performance with two threads and worse performance as the thread count increases, i.e. with 4, 8 and 16 threads. I want to find the exact reason why this version of TopHat performs best with two threads and worse with 4, 8 and 16 threads.
You only have 4 cores, and obviously, you also have background processes running. So at best 3 cores are fully free. Maybe you could add the time of 3 cores also to your analysis.
This time is the CPU time: the time spent executing this process only. The time of other processes is not counted in it. Since I have 4 cores, I expected the best performance with 4 threads, but it is worse than with 2 threads.
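For reference, the three numbers reported by `time` can be reproduced with a short Python sketch that runs a child process and reads its real, user and system time separately. It is Unix-only because it uses the `resource` module, and the workload is a stand-in one-liner, not TopHat.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, system) seconds, similar to the shell's `time`."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime      # CPU time spent in user mode
    system = after.ru_stime - before.ru_stime    # CPU time spent in the kernel
    return real, user, system


# Stand-in workload (an assumption for illustration): a short CPU-bound one-liner.
workload = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
real, user, system = time_command(workload)
print(f"real {real:.2f}s  user {user:.2f}s  sys {system:.2f}s  cpu {user + system:.2f}s")
```

User and system time are summed over all threads of the process, so for a multi-threaded program user + system can exceed real. A speedup from extra threads shows up in the real (wall-clock) time, while the summed CPU time usually stays about the same or grows.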
I'm not an expert at server architecture/sysadmin stuff, so if I'm wrong I hope someone will correct me.
While you are running TopHat using 4 cores, your computer still has background processes running. Have a look in htop: many things are going on in the background, so there aren't 4 cores free for TopHat.
Before the tophat command I wrote the time command, which reports real time, user time and system time. User + system time is the pure CPU time, i.e. the time the CPU spends on this process only. I have already listed these times in the table above.
Actually, it shows the best performance with two threads instead of 4. Normally one would expect the best performance with 4 threads because there are 4 cores, but it is best with 2 threads and worse with 4. I ran 4 different datasets and all of them show the same pattern: best performance with 2 threads and worse with every other setting, i.e. 4, 8 and 16 threads.
If you have 4 cores, that means you can utilize the CPU to 400%. If you have 4 threads, that means you can run 4 processes simultaneously (roughly), each at 100% CPU load. If you run 8+ threads on a 4 core / 4 thread processor, you're still running 4 threads at 100% at a time, but you're switching back and forth between two distinct threads on each CPU core. This adds overhead because it takes time and resources to switch between processes. You can't just throw some arbitrary number of threads at a job and think that means it gets done quicker. That's not how CPUs work. Plus you have to consider memory and I/O bottlenecks as well. In particular, 4 processes are not only going to use more CPU, but also more memory, cache, and I/O. If your system doesn't scale out well either for multi-threading or parallel processing, then you're beating a dead horse.
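One practical consequence is to cap the requested thread count at what the machine actually has before passing it to -p. A minimal Python sketch follows; the function name is hypothetical, and `os.cpu_count()` reports logical CPUs, which on a 4 core / 4 thread processor is 4.

```python
import os


def usable_threads(requested, cores=None):
    """Clamp a requested thread count to the range 1..available CPUs."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(requested, cores))


# With 4 CPUs, asking for 8 or 16 threads still gives 4 running at a time.
for requested in (1, 2, 4, 8, 16):
    print(requested, "->", usable_threads(requested, cores=4))
```

This only avoids oversubscribing the CPU. It does not help with memory or I/O limits, and it does not make a single-threaded phase any faster.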
Thank you so much
So my results are correct, and the problem is synchronization and communication between multiple threads. With two threads there is less synchronization and communication time, so performance is good. Synchronization and communication time increases as we add threads, which is why the alignment time increases with 4, 8 and 16 threads. Am I correct?
Not entirely. You missed following points:
I get the feeling you are not even reading what we write.
Lol, you only have a finite number of threads, dude.
http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz
Thank you
If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.