I have a dataset that looks like this (3 samples in total):
mRNA        Cancer-Type-1   Cancer-Type-2   Normal
-----------------------------------------------------------
mRNA1       30              49              12
mRNA2       199             200             78
...         ...             ...             ...
mRNA1000    13              40              88
We'd like to compare Cancer-Type-1 with Normal and Cancer-Type-2 with Normal.
Since the number of samples is very small, I wonder what the best way is to:
- Normalize the values, and;
- Identify which mRNAs are significantly up- or down-regulated when each cancer is compared with normal. This is typically problematic because classification methods like hierarchical clustering or KNN require large sample sizes.
What was the array platform? How have the data been processed so far? Was there a larger batch of samples from which these three were drawn that could be used for normalization? Or did you really just run three arrays? The answers to those questions would affect the normalization strategy. For significantly up/down-regulated genes, you might not be able to do much better than sorting genes by fold-change. Any statistics you try to apply will probably be misleading, and nothing would survive any kind of multiple testing correction.
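To make the fold-change suggestion concrete, here is a minimal sketch in plain Python. It is not a recommended pipeline for this platform: the pseudocount, the log2 transform, and the per-sample median centering are all assumptions chosen for illustration, and the function names are hypothetical. The values are the three rows shown in the question; with only three genes the median centering is purely a toy, and it would only start to make sense on the full table.

```python
import math
from statistics import median

# Toy values copied from the three rows shown in the question.
raw = {
    "mRNA1": {"cancer1": 30, "cancer2": 49, "normal": 12},
    "mRNA2": {"cancer1": 199, "cancer2": 200, "normal": 78},
    "mRNA1000": {"cancer1": 13, "cancer2": 40, "normal": 88},
}
SAMPLES = ["cancer1", "cancer2", "normal"]
PSEUDOCOUNT = 1.0  # assumption: keeps log2 defined if an intensity is 0


def log2_median_centered(raw, samples):
    """Log2-transform intensities, then subtract each sample's median.

    Median centering is one simple between-array scaling; it assumes that
    most genes are unchanged between samples, which may not hold here.
    """
    logged = {
        gene: {s: math.log2(values[s] + PSEUDOCOUNT) for s in samples}
        for gene, values in raw.items()
    }
    for s in samples:
        sample_median = median(logged[gene][s] for gene in logged)
        for gene in logged:
            logged[gene][s] -= sample_median
    return logged


def rank_by_fold_change(norm, case, control="normal"):
    """Return (gene, log2 fold-change) pairs, largest absolute change first."""
    lfc = {gene: v[case] - v[control] for gene, v in norm.items()}
    return sorted(lfc.items(), key=lambda kv: abs(kv[1]), reverse=True)


norm = log2_median_centered(raw, SAMPLES)
ranked_c1 = rank_by_fold_change(norm, "cancer1")
ranked_c2 = rank_by_fold_change(norm, "cancer2")

for gene, lfc in ranked_c1:
    print(f"{gene}\tcancer1 vs normal log2FC = {lfc:+.2f}")
```

With one array per condition, the ranked list is only a way to prioritise candidates for follow-up. It carries no p-value and no estimate of variability.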
Thanks. The platform is "TORAY". No processing has been done; it is raw data. There is no larger batch of samples, and this is all there is. Advice on the best way to do normalization and DE would be greatly appreciated.
How many biological replicates do you have for each condition?