To give you a concrete idea of the minimum configuration you will need to do this in the cloud:
A. If you choose a mapping-based method like salmon (or kallisto), which uses the transcriptome sequence:
For salmon (probably similar for kallisto) there are two ways to do the alignment. One is to map to just the transcriptome; for that you will need ~4 GB of RAM for each sample. The other, which is generally recommended, is to include genome decoys, and that bumps the memory requirement up to ~20 GB of RAM for human/mouse genomes. This is for 1 sample. If you want to run multiple samples in parallel, you will need to multiply this requirement by the number of samples you want to run in parallel.
B. Using an aligner like STAR or bbmap with the genome sequence:
You will need about 40 GB of RAM to create genome indexes and do alignments. This requirement is for one sample. (Note: the subread aligner can work in ~8 GB of RAM, but it may be the only splice-aware aligner that can.)
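To put the numbers above together, here is a rough sketch in Python of the memory arithmetic. The per-sample figures are just the approximate ones quoted in this answer, the dictionary keys and function names are made up for illustration, and simply multiplying by the number of parallel samples is a conservative estimate, not a measured value.

```python
# Approximate per-sample RAM in GB, taken from the figures quoted above.
# The keys are hypothetical labels, not tool options.
RAM_PER_SAMPLE_GB = {
    "salmon_transcriptome": 4,   # salmon, transcriptome only
    "salmon_with_decoys": 20,    # salmon with genome decoys (human/mouse)
    "star_genome": 40,           # STAR or bbmap against the genome
    "subread_genome": 8,         # subread aligner
}


def total_ram_gb(method, parallel_samples):
    """RAM needed to run `parallel_samples` samples at the same time."""
    if parallel_samples < 1:
        raise ValueError("parallel_samples must be at least 1")
    return RAM_PER_SAMPLE_GB[method] * parallel_samples


def max_parallel_samples(method, available_ram_gb):
    """How many samples fit in `available_ram_gb` at once (0 if none)."""
    return int(available_ram_gb // RAM_PER_SAMPLE_GB[method])


if __name__ == "__main__":
    # Four salmon jobs with decoys side by side
    print(total_ram_gb("salmon_with_decoys", 4))
    # How many genome alignments fit on a 64 GB machine
    print(max_parallel_samples("star_genome", 64))
```

Running samples one after another instead of in parallel keeps the requirement at the single-sample figure, at the cost of wall-clock time.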
For either method you will need cloud disk storage. Your data will take up 200 GB, plus some space for the programs you need. You will also need space for temporary files, genome indexes, and output results. Figure on having at least 1 TB available. There are charges to move data into and out of the cloud, so keep that in mind. You will want to have at least 8 cores available.
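If you want to check a machine (cloud instance or laptop) against the core and disk figures above before starting, something like the following sketch would do it using only the Python standard library. The thresholds are the ones suggested in this answer, the function names are made up for illustration, and RAM is left out because the standard library has no portable way to read it.

```python
import os
import shutil

MIN_CORES = 8             # "at least 8 cores"
MIN_FREE_DISK_GB = 1000   # "at least one TB"


def meets_requirements(cores, free_disk_gb,
                       min_cores=MIN_CORES, min_disk_gb=MIN_FREE_DISK_GB):
    """Compare a machine's cores and free disk space with the thresholds."""
    return {
        "cores": cores >= min_cores,
        "disk": free_disk_gb >= min_disk_gb,
    }


def inspect_machine(path="."):
    """Return (logical core count, free disk space in GB) for `path`."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    free_gb = shutil.disk_usage(path).free / 1e9
    return cores, free_gb


if __name__ == "__main__":
    cores, free_gb = inspect_machine()
    print(f"{cores} cores, {free_gb:.0f} GB free")
    print(meets_requirements(cores, free_gb))
```

Point `path` at the volume where the data and indexes will live, since free space differs between mounted disks.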
If you have never used the cloud before, it will take some time to familiarize yourself with everything, which will add time and cost. AWS and Google have calculators that let you estimate costs, but use them as a rough guide.
Note: If you have a reasonably new laptop with 8 GB (preferably 16 GB) of RAM, you may be able to do the analysis locally (at your own pace). That would save the money for cloud expenses, as others have noted.
Honestly, I routinely do RNA-seq analysis (with all those things you mentioned) on my laptop (a 2018 MacBook Pro). Why do you need a server?
How long will it take you?
How many samples? But yeah, as said in the other comment, just get any computer running and process your samples. RNA-seq quantification is trivial for end users these days. I prefer salmon, but the mentioned kallisto is also fine; both are super fast and memory-efficient.
Cause my lab wants me to use a server but I don't know which to choose.
It really doesn't matter. If I can process dozens of samples on a MacBook Pro, any server (or laptop) should suit your purpose. Modern tools (e.g. kallisto) can process your samples super fast with very low memory requirements.
I processed 200,000,000 reads in less than an hour on Google Colab once.
In the time that you waste trying to figure out what server to buy, you could have already finished your RNAseq analysis on your laptop.