Forum: SSD or HDD for genome analysis
5.1 years ago
Neil ▴ 20

Hello. I need to analyze mostly human whole exomes and, far less frequently, human whole genomes. Today I must choose which is better for this purpose: 2x480 SSD or 2x2TB HDD. The current exome analysis pipeline is very old and written in Perl, and I'm going to rewrite it with new software. Please help me choose what kind of storage I need for my goals. Also, this hardware will be used for six months.

hardware genome sequencing • 4.0k views

If this is for a workstation rather than a cluster, get some SSDs so you can productively run things in parallel. SSDs are really cost-effective these days. For archiving, get a big external drive (10TB or so) to store data once it has been analyzed.


IMO, go for an SSD on your laptop and buy a NAS (a RAID5 25TB NAS costs about 2000€; a 10TB one ~1000€).


You could also consider hybrid drives. I used to have one, and it felt like the best of both worlds to me: plenty of cheap storage from the HDD and speed from the SSD.


The hybrid drives will move frequently used files to the SSD. If you are running a new analysis, all those files will be new, so they will probably be relegated to the HDD.


Sure, not everything can be put on the SSD of a hybrid drive, because it has far more limited capacity than a full SSD. In the context of bioinformatics, it could be worth it for, let's say, the scripts and libraries that are often accessed, small databases, perhaps the indexed genome for read mapping, etc.

IMHO, hybrids are a good balance of cost, capacity and performance, but in the end, it all comes down to user needs and budget.

5.1 years ago

Interestingly, there's a whole paper about this! https://academic.oup.com/bib/article/17/4/713/2240499/

It looks like some programs sped up significantly, while others showed no improvement. Personally, I'd rather go for space than speed (so the 2x2TB HDD), but I work with massive plant genomes. I don't know what scale of data you will be working with!


Most of the time I work with human exome sequence data, which ranges in size from 1 GB to 10 GB (paired-end), and human whole-genome sequence data with a size of 100 GB (paired-end).
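
To put those sizes against the two options, here is a rough back-of-the-envelope sketch. It is only an estimate under stated assumptions: the "2x480" SSDs are read as 2 x 480 GB, 1 TB is taken as 1000 GB, and `overhead_factor` is a hypothetical multiplier for intermediate files (alignments, temporary files, results) that you should replace with a value measured on your own pipeline.

```python
def samples_that_fit(capacity_gb, raw_sample_gb, overhead_factor=3.0):
    """Estimate how many samples fit on a volume.

    overhead_factor is an assumed multiplier (not a measured value): each
    sample is taken to need raw_sample_gb * overhead_factor of disk space.
    """
    per_sample_gb = raw_sample_gb * overhead_factor
    return int(capacity_gb // per_sample_gb)


# Total capacity of each option, assuming 480 means 480 GB and 1 TB = 1000 GB.
OPTIONS_GB = {"2x480 GB SSD": 2 * 480, "2x2TB HDD": 2 * 2000}

# Sample sizes quoted in this thread (largest exome, whole genome).
SAMPLES_GB = {"exome (10 GB)": 10, "whole genome (100 GB)": 100}

if __name__ == "__main__":
    for option, capacity in OPTIONS_GB.items():
        for sample, size in SAMPLES_GB.items():
            n = samples_that_fit(capacity, size)
            print(f"{option}: about {n} x {sample}")
```

With the assumed factor of 3, only a handful of whole genomes fit on the SSD pair at once, so the answer depends mostly on how many samples need to be on disk at the same time.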

5.1 years ago
SaltedPork ▴ 170

If this is for a desktop/workstation, then I would install your OS and pipelines on an SSD. Once an analysis is complete, move the data onto HDDs for backup.
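
The "analyze on SSD, then move to HDD" step could be scripted roughly as follows. This is a minimal sketch, not a tested backup tool: the function name `archive_run` and the directory layout (one directory per finished analysis) are assumptions for illustration.

```python
import shutil
from pathlib import Path


def archive_run(run_dir, archive_root):
    """Move a finished analysis directory from fast storage to the archive disk.

    Hypothetical helper: assumes each analysis lives in its own directory.
    """
    run_dir = Path(run_dir)
    archive_root = Path(archive_root)
    if not run_dir.is_dir():
        raise FileNotFoundError(f"no such analysis directory: {run_dir}")

    destination = archive_root / run_dir.name
    if destination.exists():
        # Refuse to overwrite or merge into an existing archive copy.
        raise FileExistsError(f"already archived: {destination}")

    archive_root.mkdir(parents=True, exist_ok=True)
    # Across different filesystems, shutil.move copies the tree and then
    # removes the source, which frees the space on the SSD.
    shutil.move(str(run_dir), str(destination))
    return destination
```

For example, `archive_run("/ssd/runs/sample01", "/hdd/archive")` would leave the results under `/hdd/archive/sample01`; both paths are placeholders. For data you cannot regenerate, verifying checksums before deleting anything would be a safer variant.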
