Hi,
I want to purchase a workstation to analyse Illumina reads of a plant genome (read mapping and SNP calling). What minimum hardware configuration is required?
Minimum hardware specs are hard to give without knowing what running time is acceptable. Most analyses would run on old or cheap hardware, but would simply take terribly long. The most important spec is memory size: your genome, as indexed by the read mapper, should fit in memory. That is all that can be said without further details of genome size and software. The thread "Hardware Suitable For Generic Nextgen Sequencing Processing?" seems to still be valid, just double the RAM figures. Otherwise, get as many CPU cores and the fastest I/O you can afford. You will need a lot of disk space as well.
Other things to consider:
To echo what Michael posted: you'll need to be quite clear on the size of your genomes (plant genomes call for a lot of RAM and disk storage) and on what you want to do at your workstation (will you be doing transcriptome assembly or just SNP calling?). Once you have an exact idea of what you'll be doing and the time frame you need for your analysis, you can plan the specs of your workstation. The thread "Storage For Miseq In-House" may be of some help in addition to the one that Michael posted.
Thanks for responding,
I want the minimum hardware configuration for mapping and SNP calling. Can you suggest an ideal hardware configuration for Illumina data from plant genomes? I don't have an existing server but am planning to buy one.
Minimum or ideal? That makes a difference. Also, what software exactly are you going to run? What is the size of the largest genome you are working with? How many users will use the machine in parallel? You should use a Linux-based server. Will you use it interactively, or via ssh/telnet?
To find the minimum/optimal RAM size: get hold of or borrow a high-memory server, e.g. on the Amazon cloud, run a typical large job of the alignment step, and monitor the process and its peak memory usage (e.g. 8 GB). Double the maximum memory required by that process and buy a decent computer that has this amount of RAM installed (e.g. 16 GB) and can host at least double that amount (e.g. 32 GB, better up to 128 GB) for later upgrades.
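The procedure above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names `peak_child_rss_gb` and `ram_plan` are made up for this example, the alignment command in the usage part is a placeholder you would replace with your own, and the measurement only sees the largest single child process (on Linux or macOS), so a multi-process pipeline may need more than it reports.

```python
import resource
import subprocess
import sys


def peak_child_rss_gb(cmd):
    """Run cmd to completion and return the peak resident memory (GB)
    of the largest child process this script has waited for so far."""
    subprocess.run(cmd, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024
    return peak_bytes / 1024**3


def ram_plan(peak_gb):
    """Rule of thumb from the post: install twice the observed peak,
    and pick a machine that can later hold twice the installed amount."""
    installed = 2 * peak_gb
    max_capacity = 2 * installed
    return installed, max_capacity


if __name__ == "__main__":
    # Hypothetical usage: pass your own alignment command on the command line,
    # e.g.  python ram_check.py bwa aln ref.fa reads.fq
    if len(sys.argv) > 1:
        peak = peak_child_rss_gb(sys.argv[1:])
        installed, capacity = ram_plan(peak)
        print(f"peak {peak:.1f} GB -> install {installed:.1f} GB, "
              f"board should support {capacity:.1f} GB")
```

With the figures from the post, `ram_plan(8)` gives 16 GB to install and 32 GB as the minimum upgrade ceiling.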
Thanks Michael
I want a general configuration for mapping Illumina reads of plant genomes and SNP calling. Can I do this with a computer that has 4 quad-core processors, 16 GB RAM and 1 TB storage?
Depends exactly on what you want to do (read Michael's post above on finding minimal RAM size). The specs you have listed above are not nearly enough for what you want to do.
Well, RAM and CPU might be sufficient, but the system would very soon run out of storage for the read data. The memory might be just enough. From the BWA manual: "With bwtsw algorithm, 2.5GB memory is required for indexing the complete human genome sequences. For short reads, the ‘aln’ command uses ~2.3GB memory and the ‘sampe’ command uses ~3.5GB." The largest sequenced plant genome is barley (5.1 Gb); if memory use scales linearly with genome size, 6 GB should be enough. For one of the smaller plant genomes, it might even work with less. All of this assumes the intended pipeline uses BWA for read mapping.
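To make the linear-scaling estimate explicit, here is a small back-of-the-envelope calculation in Python. It is a sketch only: the human genome size of about 3.1 Gb is an assumption not stated in the thread, the name `scaled_mem_gb` is made up for this example, and real memory use need not scale exactly linearly.

```python
# Assumption (not from the thread): the human genome is roughly 3.1 Gb.
HUMAN_GENOME_GB = 3.1

# Memory figures for the human genome, as quoted from the BWA manual above.
BWA_HUMAN_MEM_GB = {"index (bwtsw)": 2.5, "aln": 2.3, "sampe": 3.5}


def scaled_mem_gb(genome_gb):
    """Scale the quoted human-genome memory figures linearly to another genome size."""
    factor = genome_gb / HUMAN_GENOME_GB
    return {step: round(mem * factor, 1) for step, mem in BWA_HUMAN_MEM_GB.items()}


barley = scaled_mem_gb(5.1)  # barley genome size as given in the post
for step, mem in barley.items():
    print(f"{step}: ~{mem} GB")
```

Under these assumptions the most demanding step (`sampe`) comes out a little under 6 GB for barley, which matches the estimate in the post and fits into 16 GB of RAM.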
Thanks Michael. Can I call SNPs using the configuration stated in my previous post?
No warranty, but most likely the analysis would work; you would just have no space to store the results. You will need to buy additional storage very soon, because your 1 TB disk can be full after a few runs, assuming 100 GB per run. Even the smallest compute solution offered by Illumina has 20 TB of disk space. http://www.illumina.com/documents/products/datasheets/datasheet_illuminacompute.pdf
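The storage argument is simple arithmetic, sketched here in Python. The 100 GB per run figure is the assumption from the post; the function name `runs_until_full` is made up for this example, and the calculation ignores the operating system, reference indexes and intermediate alignment files, so the real number of runs would be lower.

```python
def runs_until_full(disk_tb, gb_per_run=100):
    """Upper bound on how many runs fit on a disk, taking 1 TB = 1000 GB."""
    return int(disk_tb * 1000 // gb_per_run)


print(runs_until_full(1))   # the proposed 1 TB disk
print(runs_until_full(20))  # the 20 TB Illumina system mentioned above
```

At 100 GB per run, a 1 TB disk holds at most 10 runs, while 20 TB leaves room for about 200.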
Thanks. So of the configuration I mentioned, are the RAM and processors (16 GB and 4 quad-core) enough for SNP calling? Please advise. I will increase the storage capacity.
Average plant genome size right now is in the range of 6Gb to 8Gb. I might have a skewed view of RAM & CPU since the plant genomes I work with are in the range from 4Gb to 20Gb. If we knew what plant we were talking about and had an estimate of the genome size, we could give a little more information to you, vaibhavbarot.