Hi,
I have to design a relational database to store the results of gene expression experiments (both microarrays and RNA-seq), using MySQL.
I have a number of experiments, and for each experiment several contrasts/comparisons. For each contrast, I have thousands of genes with fold changes and p-values that I want to store. The database will be queried to retrieve genes showing differential expression across comparisons and across experiments.
What's the best design for such a scenario? I'm thinking to create one table per experiment but I'm not sure this is the best way to go about it. I'd be curious to hear from people who have experience in designing relational databases or who use SQL to query databases containing gene expression data.
Thank you!
Using a database system to store genes with just p-val and fold-change seems a bit overkill. What's the size of your data? If it is less than few hundred megabytes, you could probably just load the entire thing into memory to serve to the user via a simple webserver framework like bottle.py or node.js.
Thanks for the suggestion Damian.
That's not what I need though. I'm oversimplifying the situation in my question. In reality I'll have several columns to store for each comparison and experiment. Besides, this will be part of a larger projects and has to be a relational database.