Question

Normalization And Differential Expression From Array With Small Samples

1

12.5 years ago

gundalav ▴ 380

I have a dataset that looks like this (3 samples in total):

mRNA       Cancer-Type-1   Cancer-Type-2   Normal
--------------------------------------------------
mRNA1      30              49              12
mRNA2      199             200             78
...        ...             ...             ...
mRNA1000   13              40              88

We'd like to compare Cancer-1 with Normal and Cancer-2 with Normal. Since the number of samples is very small, I wonder what the best way is to:

Normalize the values, and
Identify which mRNAs are significantly up- or down-regulated when each cancer is compared with normal. This is typically problematic because classification methods like hierarchical clustering or KNN require large samples.

microarray normalization • 3.2k views

ADD COMMENT • link updated 12.5 years ago by vibhanim.21 ▴ 40 • written 12.5 years ago by gundalav ▴ 380

0

What was the array platform? How have the data been processed so far? Was there a larger batch of samples from which these three were drawn that could be used for normalization? Or did you really just run three arrays? The answers to those questions would affect the normalization strategy. For significantly up- or down-regulated genes, you might not be able to do much better than sorting genes by fold-change. Any statistics you try to apply will probably be misleading, and nothing would survive any kind of multiple testing correction.
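A minimal sketch of the "sort genes by fold-change" idea, using only the three rows shown in the question. The pseudocount of 1, the function names, and the use of raw (un-normalized) intensities are assumptions made for illustration; in practice the values would be normalized first.

```python
import math

# Raw intensities copied from the three rows shown in the question.
# Columns: (Cancer-Type-1, Cancer-Type-2, Normal)
raw = {
    "mRNA1": (30, 49, 12),
    "mRNA2": (199, 200, 78),
    "mRNA1000": (13, 40, 88),
}

PSEUDOCOUNT = 1.0  # assumption: avoids log(0) and damps ratios of tiny values


def log2_fold_change(case, control, pseudocount=PSEUDOCOUNT):
    """log2((case + c) / (control + c)); positive means higher in the case sample."""
    return math.log2((case + pseudocount) / (control + pseudocount))


def rank_by_fold_change(data, case_index, control_index=2):
    """Return (gene, log2FC) pairs sorted by decreasing absolute log2FC."""
    scored = [
        (gene, log2_fold_change(values[case_index], values[control_index]))
        for gene, values in data.items()
    ]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


cancer1_vs_normal = rank_by_fold_change(raw, case_index=0)
cancer2_vs_normal = rank_by_fold_change(raw, case_index=1)

for gene, lfc in cancer1_vs_normal:
    print(f"{gene}\t{lfc:+.2f}")
```

With one array per condition this gives a ranking only, not a significance call; the sign of each value indicates the direction of change relative to Normal.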

ADD REPLY • link 12.5 years ago by Obi Griffith 20k

1

Thanks. The platform is "TORAY". No processing has been done; this is raw data. There is no larger batch of samples, and this is all there is. Any advice on the best way to do normalization and DE would be greatly appreciated.

ADD REPLY • link 12.5 years ago by gundalav ▴ 380

0

How many biological replicates do you have for each condition?

ADD REPLY • link 12.5 years ago by colinDotAIBN ▴ 20

score 2 · Answer 1 · 2013-01-31

I have zero experience with TORAY platform data (I had never even heard of it before), and Google turns up surprisingly little. I would start by just plotting the distributions of your raw data for each sample and maybe comparing MA plots to get a general sense of how the data look. You might consider quantile normalization (e.g., the R function normalize.quantiles in the preprocessCore package). To determine significantly up- or down-regulated genes, you can look at this paper for ideas. There are probably others. But honestly, with just one sample versus two, I think you can forget about statistics. I would just calculate fold changes and maybe bin according to absolute expression level, so that you treat a 2/1 ratio with more skepticism than a 2000/1000 ratio.
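To make the quantile normalization and MA-plot suggestions concrete, here is a rough pure-Python sketch. It is not the preprocessCore implementation: the function names are hypothetical, ties are broken by position rather than averaged, and only the three rows shown in the question are used as input.

```python
import math


def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (lists of intensities).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position here for simplicity.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of probes")

    # Mean of the k-th smallest value across samples, for each rank k.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def ma_values(case, control):
    """Return (M, A) per probe: M = log2 ratio, A = mean log2 intensity."""
    result = []
    for x, y in zip(case, control):
        lx, ly = math.log2(x), math.log2(y)
        result.append((lx - ly, (lx + ly) / 2))
    return result


# Rows shown in the question, in order: mRNA1, mRNA2, mRNA1000.
cancer1 = [30, 199, 13]
cancer2 = [49, 200, 40]
normal = [12, 78, 88]

norm_c1, norm_c2, norm_normal = quantile_normalize([cancer1, cancer2, normal])
ma_c1 = ma_values(norm_c1, norm_normal)  # Cancer-Type-1 versus Normal
```

Plotting M against A for each comparison gives the MA plot. The A value is also a natural axis for binning by absolute expression level, so that large ratios between low-intensity probes can be treated with more caution.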

score 1 · Answer 2 · 2013-02-02

1

12.5 years ago

vibhanim.21 ▴ 40

I have not worked with this platform before, but I have done a similar experiment comparing two data sets. All I did was draw MA plots and apply quantile normalization. Performing a t-test helped a lot, and the results were compared afterwards.

ADD COMMENT • link 12.5 years ago by vibhanim.21 ▴ 40