Question

Methods For Comparing Microarrays From Different Datasets

3

11.9 years ago

Adam Cornwell ▴ 510

I often run into situations that fall outside the scope of most existing microarray meta-analysis solutions: I have two sets of arrays to compare (say, RNA from a particular cell type vs. whole tissue), but the two sets come from different datasets and sometimes different platforms. Most of the time a direct comparison is not appropriate, because the variability due to the batch effect is greater than that due to the biology. Batch-effect correction methods such as ComBat aren't appropriate either, because the batch effect and the variable of interest are confounded.
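A minimal sketch of why the confounding matters, assuming a simple linear model with an intercept, a batch term, and a group term. The sample layout and the `matrix_rank` and `design` helpers are illustrative assumptions, not part of any package mentioned here. When every cell-type array comes from one dataset and every tissue array from the other, the batch and group columns are identical, so the design matrix loses a rank and the two effects cannot be estimated separately.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design(batch, group):
    """One row per array: intercept, batch indicator, group indicator."""
    return [[1, b, g] for b, g in zip(batch, group)]


# Hypothetical layout: dataset 0 holds only cell-type arrays (group 0),
# dataset 1 holds only whole-tissue arrays (group 1).
confounded = design(batch=[0, 0, 0, 1, 1, 1], group=[0, 0, 0, 1, 1, 1])

# Hypothetical layout in which each dataset contains both groups.
balanced = design(batch=[0, 0, 0, 1, 1, 1], group=[0, 1, 0, 1, 0, 1])

print(matrix_rank(confounded))  # 2: batch and group are indistinguishable
print(matrix_rank(balanced))    # 3: both effects are estimable
```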

So far, I've been normalizing the datasets separately and then comparing them using RankProd. I'd like to try a different method because I've had some complaints about unexpected results in my output gene lists, and I'd like to confirm that the output from multiple methods correlates reasonably well so that I can present the results with some confidence. After a decent search, I haven't come up with much aside from RankProd and METRADISC that is actually semi-advertised as able to handle this sort of scenario, and the latter is also rank-based.
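For reference, the core idea behind a rank-product comparison can be sketched as follows. This is a simplified illustration, not the RankProd implementation: it assumes log2-scale values with genes in the same order in every sample, breaks ties by position, and omits the permutation-based significance estimates. The function names and the toy values are made up for the example.

```python
import math
from itertools import product


def rank_descending(values):
    """Rank 1 = largest value. Ties are broken by position (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_product(group_a, group_b):
    """Geometric mean of each gene's rank over all A-vs-B sample pairs.

    group_a, group_b: lists of samples; each sample is a list of log2
    expression values, one per gene, in the same gene order.
    A small result means the gene is consistently higher in group A.
    """
    n_genes = len(group_a[0])
    log_rank_sums = [0.0] * n_genes
    n_pairs = 0
    for a, b in product(group_a, group_b):
        log_ratios = [x - y for x, y in zip(a, b)]
        for gene, rank in enumerate(rank_descending(log_ratios)):
            log_rank_sums[gene] += math.log(rank)
        n_pairs += 1
    return [math.exp(s / n_pairs) for s in log_rank_sums]


# Toy data: 4 genes, 2 arrays per group (values are invented).
cell_type = [[10.0, 5.0, 7.0, 6.0], [9.5, 5.4, 7.1, 6.1]]
whole_tissue = [[6.0, 5.1, 7.0, 8.0], [6.5, 5.0, 7.2, 8.5]]

rp = rank_product(cell_type, whole_tissue)
print(rp)  # the first gene has the smallest rank product, the last the largest
```

Because only the within-comparison ranks are used, the statistic is insensitive to monotone scale differences between arrays, which is part of why rank-based methods are suggested for cross-dataset comparisons.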

I'm getting to the point of just wanting to try something that sounds crazy, like normalizing separately, combining, median-centering/scaling (when there's more than one dataset involved), then transforming to POE (MetaArray package) and following with differential expression testing. Would that actually... work?
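The median-centering/scaling step mentioned above could look roughly like this. It is a sketch under assumptions: values are already log2-scale, centering is done per array rather than per gene, and the median absolute deviation (MAD) is used as the scale, which is one common choice rather than anything prescribed by the packages named here. `median_center_scale` is a hypothetical helper.

```python
from statistics import median


def median_center_scale(sample):
    """Center one array on its median and scale by its MAD.

    sample: list of log2 expression values for a single array.
    """
    med = median(sample)
    mad = median([abs(x - med) for x in sample])
    if mad == 0:
        raise ValueError("MAD is zero; cannot scale this array")
    return [(x - med) / mad for x in sample]


# Invented values for a single array.
array = [2.0, 4.0, 6.0, 8.0, 10.0]
scaled = median_center_scale(array)
print(scaled)  # [-2.0, -1.0, 0.0, 1.0, 2.0]
```

Centering per array keeps the ordering of genes within each array intact. Centering each gene within its own dataset would instead remove the between-dataset difference, which in this design is also the biological contrast of interest.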

Is it appropriate to use an effect size-based method as implemented in GeneMeta for this sort of thing?

I've been holding off on experimenting with this until finding a more solid answer, but it seems like I really need to make some progress on this soon.

microarray differential-expression normalization • 7.1k views

ADD COMMENT • link updated 11.4 years ago by Biostar 20 • written 11.9 years ago by Adam Cornwell ▴ 510

0

I highlighted the question; it was a little bit lost in the text :)

ADD REPLY • link 11.9 years ago by Neilfws 49k

0

Thanks. Ultimately it's about soliciting suggestions for dealing with such a scenario. It really seems like I need to practice writing effective questions!

ADD REPLY • link 11.9 years ago by Adam Cornwell ▴ 510

score 1 · Answer 1 · 2013-05-31

Hi Adam,

There are approaches such as meta-analysis across different experiments published online; however, they seem sophisticated to me. RankProd was suggested to me by the author, but I did not have the time or the experience to customise it to my needs. What I usually do is normalize them all separately, then run a cluster analysis in BioLayout with a unique ID for all microarrays (using RefSeq if they are not from the same Affymetrix version or are from different platforms), and then check for a cluster with an expression pattern of interest across the different expression data. Using a unique ID such as RefSeq will reduce the number of genes that you can test.
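As a rough illustration of the shared-ID step, assuming each dataset has already been collapsed to one value per RefSeq ID (the IDs and values below are placeholders, and `restrict_to_shared_ids` is a hypothetical helper):

```python
def restrict_to_shared_ids(dataset_a, dataset_b):
    """Keep only the IDs present in both datasets, in a fixed order.

    dataset_a, dataset_b: dicts mapping a gene-level ID to an expression value.
    Returns (shared_ids, values_a, values_b) aligned by position.
    """
    shared = sorted(dataset_a.keys() & dataset_b.keys())
    return shared, [dataset_a[i] for i in shared], [dataset_b[i] for i in shared]


# Placeholder IDs and values, for illustration only.
platform_1 = {"NM_A": 8.1, "NM_B": 5.3, "NM_C": 9.0}
platform_2 = {"NM_B": 6.2, "NM_C": 7.7, "NM_D": 4.4}

ids, values_1, values_2 = restrict_to_shared_ids(platform_1, platform_2)
print(ids)  # ['NM_B', 'NM_C']: IDs unique to one platform are dropped
```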

I hope this will be handy.

score 0 · Answer 2 · 2013-05-31

0

11.9 years ago

ewre ▴ 260

Data heterogeneity is a central problem with microarray data from different sources. There are methods that try to scale datasets from different labs using control probes or housekeeping-gene probes on the same platform, but they have little effect.

ADD COMMENT • link 11.9 years ago by ewre ▴ 260