My lab is about to receive 30X WGS data (FASTQs) for roughly 400-1500 samples (depending on how many we get at a time) that need to be processed and analyzed. We are trying to figure out the best route for computational resources. Here are some options:
- We can buy an allocation from Google Cloud Platform for $1,146 per month (assuming the instance runs 24/7, since Google bills by the minute), which would include:
- 32-core vCPU with 120GB of RAM
- 1.5TB of SSD space
- We can purchase a node through our university for about $700 per month; however, we would be limited to 14,000 core-hours per month.
- 20 cores
- 256GB RAM
- We can purchase a system through Dell for about $15,000-20,000, with 100+ cores, a few TB of HDD space (we already have NAS storage), and a 3-year warranty.
Now, there seem to be pros and cons to each of these choices. For example, purchasing a server directly from Dell carries hidden costs such as upkeep, and the hardware would depreciate over time, whereas anything rented from GCP or our university would be kept up to date. On the other hand, purchasing a system seems more economical in the long term (provided our computational needs don't rise significantly over the next 3 years).
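To make the long-term comparison concrete, here is a quick break-even sketch using only the figures quoted above. It deliberately ignores power, cooling, storage growth, and sysadmin time, so it understates the true cost of owning hardware; treat it as a first approximation, not a full cost model.

```python
# Break-even sketch for the options above. Prices and core counts are
# the ones quoted in this post; everything else (power, cooling, admin
# time) is intentionally left out.

GCP_MONTHLY = 1146            # 32 vCPUs, 120 GB RAM, 1.5 TB SSD
UNI_MONTHLY = 700             # 20 cores, 256 GB RAM, 14,000 core-hour cap
DELL_UPFRONT = (15_000, 20_000)

# Sanity check on the university cap: a 20-core node running 24/7 for a
# 30-day month uses 20 * 24 * 30 = 14,400 core-hours, so the 14,000
# core-hour limit is close to full utilization of the node.
cap_utilization = 14_000 / (20 * 24 * 30)
print(f"University cap ~ {cap_utilization:.0%} of 24/7 use")

for upfront in DELL_UPFRONT:
    print(f"Dell at ${upfront:,}:")
    print(f"  vs GCP ($1,146/mo):        {upfront / GCP_MONTHLY:.1f} months to break even")
    print(f"  vs university ($700/mo):   {upfront / UNI_MONTHLY:.1f} months to break even")
```

By this rough math, the Dell box pays for itself against GCP in about 13-17 months and against the university node in about 21-29 months, both comfortably inside the 3-year warranty, which is presumably why buying looks economical if utilization stays high.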
Are there any options I may be missing? Also, am I over- or underestimating our computational needs for this type of analysis?