Is anyone here using a laptop/server/cluster with a Solid State Drive for bioinformatics data processing? A friend recommended SSDs for data crunching because I/O calls should be faster and more efficient on an SSD than on an HDD. Any experiences or thoughts?
I have tried a few times to optimize data processing with a virtual file system stored in RAM, which should provide the fastest possible I/O. Somewhat disappointingly, I have not observed a substantial performance increase in tasks related to mapping reads to the genome.
Other applications may show different performance characteristics. I would test the performance change with tmpfs first because it is easy to set up.
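A minimal sketch of such a test in Python, assuming a Linux machine where /dev/shm is a tmpfs mount. The names compare_disk_vs_ram and count_lines are made up for illustration, and count_lines is only a stand-in for whatever your real tool does with the file. Note that copying the file also pulls it into the page cache, so the "disk" timing here is optimistic unless the file is much larger than RAM or the caches are dropped first.

```python
import os
import shutil
import tempfile
import time


def time_workload(workload, path):
    """Return the seconds taken by workload(path)."""
    start = time.perf_counter()
    workload(path)
    return time.perf_counter() - start


def compare_disk_vs_ram(workload, src_path, ram_dir="/dev/shm"):
    """Time workload on the original file and on a copy placed under ram_dir.

    ram_dir is assumed to be a tmpfs mount; nothing here verifies that.
    """
    tmp_dir = tempfile.mkdtemp(dir=ram_dir)
    try:
        ram_copy = shutil.copy(src_path, tmp_dir)
        return {
            "disk": time_workload(workload, src_path),
            "ram": time_workload(workload, ram_copy),
        }
    finally:
        # Always free the RAM used by the temporary copy.
        shutil.rmtree(tmp_dir, ignore_errors=True)


def count_lines(path):
    """Hypothetical stand-in workload: read the file once, line by line."""
    with open(path, "rb") as fh:
        return sum(1 for _ in fh)
```

Calling compare_disk_vs_ram(count_lines, "reads.fastq") would return the two timings in seconds, where "reads.fastq" is a placeholder for your own input. If the two numbers are close, the job is probably not I/O bound and an SSD is unlikely to help much.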
An SSD helps only if the software spends significant time waiting for the disk, especially when it issues lots of small random-access I/O requests. The random access time of an SSD is much better than that of a spinning disk.
If you access the data sequentially, HDD throughput is quite good and you will not gain much by switching to an SSD. For example, Oracle gets over 10 GB/s throughput from HDDs in their Exadata database machine.
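To see which access pattern dominates on a given drive, something like the following rough Python sketch can time sequential against random block reads on a file. The function names, the 4 KiB block size and the read count are arbitrary choices for illustration, not tuned values. A freshly written or recently read file is served from the page cache, so for a meaningful number the test file should be larger than RAM or the caches should be dropped first.

```python
import os
import random
import tempfile
import time


def time_reads(path, block_size=4096, n_reads=2000, sequential=True, seed=0):
    """Return the seconds taken to read n_reads blocks from path."""
    n_blocks = max(1, os.path.getsize(path) // block_size)
    n_reads = min(n_reads, n_blocks)
    if sequential:
        # Consecutive blocks from the start of the file.
        offsets = [i * block_size for i in range(n_reads)]
    else:
        # Block-aligned offsets picked at random across the whole file.
        rng = random.Random(seed)
        offsets = [rng.randrange(n_blocks) * block_size for _ in range(n_reads)]
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as fh:
        for offset in offsets:
            fh.seek(offset)
            fh.read(block_size)
    return time.perf_counter() - start


def make_test_file(directory, size_bytes=8 * 1024 * 1024):
    """Create a file of random bytes in directory and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as fh:
        fh.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    test_path = make_test_file(tempfile.gettempdir())
    try:
        print("sequential:", time_reads(test_path, sequential=True))
        print("random:    ", time_reads(test_path, sequential=False))
    finally:
        os.remove(test_path)
```

On a spinning disk with a cold cache the random case should be far slower than the sequential one; on an SSD the gap is expected to be much smaller.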
Thanks for your points, Jussi. I was looking at a process that issues a lot of small random-access I/O requests. I was curious whether anyone is already running bioinformatics applications on an SSD, because machines with SSDs are very costly compared to the same configuration with an HDD.
Thanks for sharing your thoughts.
+1 for the simple test. On modern Linux no setup is necessary. Out of the box, /dev/shm/ (tmpfs) is already mounted, and by default it is sized at half of the RAM.
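Before copying a large input there, it may be worth checking how much room the mount actually has. A small Python sketch, assuming Linux; tmpfs_report is a made-up helper name, and the RAM lookup relies on the SC_PHYS_PAGES and SC_PAGE_SIZE sysconf names, which are not available on every platform.

```python
import os
import shutil


def tmpfs_report(path="/dev/shm"):
    """Return size information for path, or None if it does not exist."""
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    try:
        # Total physical RAM in bytes (works on Linux, may fail elsewhere).
        ram = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        ram = None
    return {
        "total": usage.total,
        "free": usage.free,
        "ram": ram,
        "fraction_of_ram": usage.total / ram if ram else None,
    }


if __name__ == "__main__":
    print(tmpfs_report())
```

With the default mount options, fraction_of_ram should come out close to 0.5. Keep in mind that files stored there count against memory, so a large input competes with the program that processes it.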