Computation requirement for wheat genome
2.3 years ago
Deepak Tanwar ★ 4.2k

I have 5.5 TB of re-sequencing data from the wheat genome. In total, there are 41 samples.

How much computing power would I need for QC and trimming, alignment, de-duplication, and variant calling? I want to finish all of these analyses within 2 months at most.

Does this sound reasonable?

  • For 1 sample: 12 cores with 24 GB RAM per core, i.e. 12 cores / 288 GB RAM

  • For 41 samples in parallel: 41 × (12 cores / 288 GB) = 492 cores / 11,808 GB RAM
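The arithmetic above can be checked with a short sketch. The function name `total_request` is made up for illustration, and the average data volume per sample assumes decimal terabytes (1 TB = 1000 GB), which the post does not specify.

```python
# Per-sample request taken from the question: 12 cores, 24 GB RAM per core.
CORES_PER_SAMPLE = 12
RAM_GB_PER_CORE = 24
N_SAMPLES = 41
TOTAL_DATA_TB = 5.5


def total_request(n_parallel):
    """Return (cores, ram_gb) needed to run n_parallel samples at once."""
    cores = n_parallel * CORES_PER_SAMPLE
    ram_gb = cores * RAM_GB_PER_CORE
    return cores, ram_gb


print(total_request(1))          # (12, 288)
print(total_request(N_SAMPLES))  # (492, 11808)

# Average input size per sample, assuming 1 TB = 1000 GB.
print(round(TOTAL_DATA_TB * 1000 / N_SAMPLES))  # 134
```

Note that the 492-core figure only applies if all 41 samples run at the same time; running fewer samples concurrently scales the request down linearly.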

Thank you very much in advance for your suggestions.

computation • 524 views
2.3 years ago
GenoMax 147k

It depends on whether you want to run all samples in parallel or in batches, with a certain number running at one time. Since you have a limit of 2 months, that will probably dictate the method. The 5.5 TB total is not the issue in itself; what matters is that each job needs a certain amount of RAM for the wheat genome indexes. Which program are you planning to use (bwa?)

You should benchmark a couple of samples with different hardware allocations to determine how much is needed to finish one sample in a reasonable time (I am not sure whether "finish" here means the complete analysis you mention above). If you are not limited by available hardware, i.e. you can use cloud resources, then the limiting factor is what you are willing to spend rather than time.
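Once a benchmark gives a wall-clock time for one sample, the batch size needed to meet the deadline follows from simple arithmetic. The sketch below is one way to do that; the function name `min_concurrent_samples` and the 120-hour figure are hypothetical placeholders, not measurements, and it assumes every sample takes about as long as the benchmarked one.

```python
import math


def min_concurrent_samples(n_samples, hours_per_sample, deadline_days):
    """Smallest number of samples to run at once so all batches fit the deadline.

    Assumes each sample takes roughly hours_per_sample of wall-clock time
    and that batches run back to back with no idle time in between.
    """
    deadline_hours = deadline_days * 24
    if hours_per_sample > deadline_hours:
        raise ValueError("a single sample does not fit in the deadline")
    # How many batches can run one after another before the deadline.
    max_batches = int(deadline_hours // hours_per_sample)
    return math.ceil(n_samples / max_batches)


# Hypothetical benchmark result: 120 hours per sample on 12 cores.
batch_size = min_concurrent_samples(41, 120, 60)
print(batch_size)       # 4 samples at a time
print(batch_size * 12)  # 48 cores in total at 12 cores per sample
```

With these placeholder numbers, 4 concurrent samples would be enough, far fewer than running all 41 in parallel. Repeating the calculation for each benchmarked hardware allocation shows which one meets the 2-month limit with the smallest total request.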
