After homogenizing microarray gene expression data with a batch effect correction algorithm, such as SVA or ComBat, how do you evaluate its effectiveness?
Unless you are working with simulated data, evaluating the correction will require some subjective assessment. Here are a few scenarios.

If you have reviewed the uncorrected data and your experimental design includes known batch drivers (date of run, source of material, batch of reagents, etc.), the assessment can be fairly clear: measure the amount of variance associated with those batch factors. It should be reduced or eliminated after batch correction.

If your batch drivers are not known, then I think you need to rely on subjective sanity checks. For example, does the batch correction greatly reduce differential expression between experimental groups that should show a large effect (bad)? Do any of the batch correction adjustments correlate with experimental factors of interest (also probably bad)? If you have known spikes or true positives, are they still differentially expressed after the batch correction (good)? Do the downstream results that you care about in your experiment (clustering, classification, gene-centric biological conclusions) change radically after the correction (could be good or bad)?
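For the known-batch case, one simple way to put a number on "variance associated with a batch factor" is the per-gene fraction of variance explained by batch (the R^2 from a one-way ANOVA decomposition), computed before and after correction. The sketch below is only an illustration under assumptions: the data are simulated, the function names are made up for this example, and `center_by_batch` is a toy per-batch mean-centering used as a stand-in for a real method such as ComBat or SVA, not an implementation of either.

```python
import numpy as np

def batch_variance_fraction(expr, batch):
    """Per-gene fraction of variance explained by a categorical batch factor.

    expr  : (n_genes, n_samples) array of expression values
    batch : length n_samples array of batch labels
    Returns SS_between / SS_total for each gene (one-way ANOVA decomposition).
    """
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    grand_mean = expr.mean(axis=1, keepdims=True)
    ss_total = ((expr - grand_mean) ** 2).sum(axis=1)
    ss_between = np.zeros(expr.shape[0])
    for label in np.unique(batch):
        cols = batch == label
        group_mean = expr[:, cols].mean(axis=1, keepdims=True)
        ss_between += cols.sum() * ((group_mean - grand_mean) ** 2).ravel()
    # Genes with zero total variance get a fraction of 0 rather than NaN.
    return np.divide(ss_between, ss_total,
                     out=np.zeros_like(ss_between), where=ss_total > 0)

def center_by_batch(expr, batch):
    """Toy stand-in for a real correction: remove each batch's per-gene mean offset."""
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    corrected = expr.copy()
    grand_mean = expr.mean(axis=1, keepdims=True)
    for label in np.unique(batch):
        cols = batch == label
        corrected[:, cols] -= expr[:, cols].mean(axis=1, keepdims=True) - grand_mean
    return corrected

# Simulated data (assumption): 200 genes, 3 runs of 6 samples, per-gene batch shifts.
rng = np.random.default_rng(0)
n_genes, n_per_batch = 200, 6
batch = np.repeat(["run1", "run2", "run3"], n_per_batch)
raw = rng.normal(size=(n_genes, batch.size))
shifts = rng.normal(scale=2.0, size=(n_genes, 3))
raw += np.repeat(shifts, n_per_batch, axis=1)

corrected = center_by_batch(raw, batch)
r2_before = batch_variance_fraction(raw, batch)
r2_after = batch_variance_fraction(corrected, batch)
print("median batch R^2 before:", np.median(r2_before))
print("median batch R^2 after: ", np.median(r2_after))
```

With the toy centering, the batch fraction drops to essentially zero by construction; with a real correction you would expect it to shrink rather than vanish. Running the same function with the experimental group labels in place of `batch` gives a rough check on the other concern above: if the fraction of variance explained by the groups of interest collapses after correction, the correction may be removing biological signal.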
Hi Ahill, thanks for your answer. In case you have known batch driver factors, how do you assess the amount of variance associated with those factors? Thanks! Maria