Question

Normalization method to be used when dealing with multiple datasets

4.6 years ago

vipulwagh31 ▴ 20

Hello everyone,

I have multiple datasets from different platforms (Illumina and Affymetrix), and I have already performed a differential expression analysis on each dataset individually. I used RMA normalization for the Affymetrix data and rsn normalization for the Illumina data. Now I plan to perform a meta-analysis of these datasets, but I am not sure which normalization method to use before combining them. My questions are:

Is it necessary to use the same normalization method for all the datasets before combining them for further analysis?
If yes, which method is recommended in this case, given that the datasets come from two different platforms?
If no, how could this affect the meta-analysis? (I plan to quantile-normalize the combined file before running the differential expression analysis.)

I would really appreciate any response to my questions. It would be a great help if you could direct me to the appropriate literature. I would also appreciate pointers to any available packages for meta-analysis.

Thanks

Normalization Meta-analysis • 1.2k views

updated 4.6 years ago by ATpoint 88k • written 4.6 years ago by vipulwagh31 ▴ 20

The idea of a meta-analysis is precisely NOT to put the datasets into one statistical DE framework, since they come from different platforms, each with its own batch effects and technical differences.

You would take the final DE results of each individual dataset and then compare the results themselves, not the counts or intensities as in an individual DE analysis.

This could be done, e.g., using rank-based methods such as the one suggested in this methods paper: https://pubmed.ncbi.nlm.nih.gov/22247279/ which is implemented in the R package RobustRankAggreg: https://cran.r-project.org/web/packages/RobustRankAggreg/index.html

Here you would rank each gene list by a metric, for example significance, and then compare the ranks between studies to find out which genes consistently rank high (e.g. always upregulated) or consistently rank low (consistently downregulated).
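
To make the rank-comparison idea concrete, here is a minimal Python sketch. It is not the RobustRankAggreg algorithm; it swaps in a much simpler mean of normalized ranks purely for illustration. The gene names, p-values, and the function names `rank_genes` and `aggregate_ranks` are all hypothetical, and it assumes each study provides one p-value per gene.

```python
from statistics import mean


def rank_genes(pvalues):
    """Map each gene to a normalized rank in (0, 1]; smallest p-value ranks first."""
    # Ties in p-value are broken alphabetically, a simplification for this sketch.
    ordered = sorted(pvalues, key=lambda gene: (pvalues[gene], gene))
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def aggregate_ranks(studies):
    """Average the per-study normalized ranks over genes measured in every study."""
    shared = set.intersection(*(set(study) for study in studies))
    ranked = [rank_genes({g: study[g] for g in shared}) for study in studies]
    scores = {g: mean(r[g] for r in ranked) for g in shared}
    # Lowest mean rank first, i.e. most consistently significant genes on top.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Hypothetical DE p-values from two separately normalized and analysed studies.
affy = {"GENE_A": 0.001, "GENE_B": 0.20, "GENE_C": 0.04, "GENE_D": 0.60}
illumina = {"GENE_A": 0.003, "GENE_B": 0.50, "GENE_C": 0.01, "GENE_D": 0.30}

consensus = aggregate_ranks([affy, illumina])
for gene, score in consensus:
    print(gene, score)
```

Only the per-study results enter the comparison, so each dataset keeps its own normalization (RMA or rsn). To separate up- from downregulated genes, one could rank by a signed statistic instead of the p-value; the R package linked above additionally assigns a significance score to each gene, which this sketch does not attempt.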

4.6 years ago by ATpoint 88k