Normalization of Batch effect removal
16 months ago

I am attempting an integrated analysis of microarray data. Each dataset was normalized differently: quantile normalization, z-score, and RMA. Is it still possible to integrate and analyze them with ComBat?

ComBat • 1.5k views
16 months ago
LauferVA 4.5k

Yes, with ComBat and with other packages:

While newer versions of packages such as ComBat and sva are aimed more at RNA-seq, the original versions of these and other packages were designed for microarray data and, if I recall correctly, still include functionality for it. Their documentation will indicate the best kind of input to provide (raw, normalized, etc.).

See, for instance,

Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics, 8 (1), 118-127

Leek JT and Storey JD. (2007) Capturing heterogeneity in gene expression studies by ‘Surrogate Variable Analysis’. PLoS Genetics, 3: e161.

From the sva reference manual:

sva has functionality to estimate and remove artifacts from high dimensional data. The sva function can be used to estimate artifacts from microarray data. The svaseq function can be used to estimate artifacts from count-based RNA-sequencing (and other sequencing) data.

The ComBat function can be used to remove known batch effects from microarray data.

The fsva function can be used to remove batch effects for prediction problems.

I have also outlined a "DIY" approach that is a good starting point for many data types: Batch correction for Nanopore RNAseq.


Hello, VAL! Thank you for your comment.

It is very informative, with helpful attachments. I thought that the state of each dataset (raw, normalized, etc.) had to be unified before using ComBat.

Now I am thinking that it might be manageable as long as at least the dataset used as the reference when integrating (the dataset that contains all conditions) is normalized.

Thank you for your wealth of knowledge and kind response.


Can you process them so that they are all normalized, or at least all consistent, and THEN run ComBat?

VL


The states of the datasets vary.

Raw signals, quantile normalization, z-scores, RMA, and so on.

I am now trying to figure out how to make them consistent.

But I thought ComBat would adjust the other datasets based on the reference dataset. If the reference dataset (the dataset that contains all the conditions) is normalized, will the other datasets be adjusted as well?
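To make the idea of a reference dataset concrete, here is a minimal sketch of a reference-based location/scale adjustment in plain Python. It is not ComBat: it ignores biological covariates and the empirical Bayes shrinkage described in Johnson et al. (2007), and the function name `adjust_to_reference` and the toy values are made up for illustration. It only shows the general shape of the idea: the reference batch is left as it is, and each gene in the other batches is shifted and rescaled toward the reference. If I remember correctly, sva's ComBat exposes a `ref.batch` argument for this purpose, but please check the documentation.

```python
from statistics import mean, pstdev


def adjust_to_reference(batches, ref_name):
    """Shift and rescale each gene in every non-reference batch so that its
    mean and standard deviation match the same gene in the reference batch.

    batches: dict mapping batch name -> list of genes, where each gene is a
             list of per-sample values (same genes, same order, in all batches).
    """
    ref_stats = [(mean(g), pstdev(g)) for g in batches[ref_name]]
    adjusted = {}
    for name, genes in batches.items():
        if name == ref_name:
            # The reference batch is copied through unchanged.
            adjusted[name] = [list(g) for g in genes]
            continue
        out = []
        for gene, (ref_mu, ref_sd) in zip(genes, ref_stats):
            mu, sd = mean(gene), pstdev(gene)
            scale = ref_sd / sd if sd > 0 else 1.0  # avoid dividing by zero
            out.append([(x - mu) * scale + ref_mu for x in gene])
        adjusted[name] = out
    return adjusted


# Toy data (hypothetical): two genes, three samples per batch.
batches = {
    "reference": [[8.0, 9.0, 10.0], [5.0, 5.5, 6.0]],
    "other": [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]],
}
adjusted = adjust_to_reference(batches, "reference")
```

Note that this per-gene rescaling only makes sense if the batches contain comparable sample groups; otherwise real biological differences get removed along with the batch effect.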


Hi yoshi,

ComBat will return batch-corrected values for each input, but it will not convert them to the same data type (I think; I have not checked in a while).

But at some point in the analysis, you will either need to harmonize them, or you will have to perform all the analyses on different kinds of data.

I usually find it easier and better to harmonize as many of the datasets as possible to one data type as early as possible, to the extent the datasets allow.

So, in this case, I'd try to convert the data to the same type of values early on, as far as possible, and then run the rest of the pipeline. That said, I am not aware of a specific need to do this before running ComBat...
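As one example of what harmonizing to the same type of values could look like, here is a minimal sketch of quantile normalization across samples in plain Python. The function name and toy values are made up, ties are broken by position rather than averaged, and in practice you would use an established implementation. Whether this step is appropriate for a given dataset depends on how that dataset was processed upstream.

```python
def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    samples: list of samples, each a list of values over the same genes.
    Ties within a sample are broken by position (no rank averaging).
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Target distribution: the mean of the k-th smallest value across samples.
    target = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n_genes)
    ]
    normalized = []
    for s in samples:
        # Gene indices ordered from smallest to largest value in this sample.
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = target[rank]
        normalized.append(out)
    return normalized


# Toy data (hypothetical): three samples measured over four genes.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

After this step each sample contains exactly the same set of values; only the assignment of values to genes differs, following each gene's rank within its sample.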


Thanks for your response! Oh, really? I was mistaken.

Then, for example, it looks like it would be a good idea to unify all of them with quantile normalization at an early stage, before running ComBat.

The data I am analyzing come from a public database, and the uploads include RMA values, z-scores, etc.

It seems difficult to convert z-scores to quantile-normalized values and so on, so integration still seems difficult.
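A small sketch of why the z-score case is hard, assuming the z-scores were computed per gene within a dataset (the upload may have done it differently). The values below are made up. Two genes with very different intensities end up with the same z-scores, so the original mean and spread cannot be recovered from the z-scores alone.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(x - mu) / sd for x in values]


# Hypothetical intensities for two genes across the same three samples.
gene_low = [2.0, 4.0, 6.0]
gene_high = [100.0, 110.0, 120.0]

z_low = zscore(gene_low)
z_high = zscore(gene_high)

# True if both genes give the same z-scores, up to floating-point error.
same_zscores = all(abs(a - b) < 1e-9 for a, b in zip(z_low, z_high))
```

Going back from z-scores to intensities would need each gene's original mean and standard deviation, which are usually not part of the upload.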


Please double-check me on that using the sva/ComBat documentation.
