I have a transcriptomic dataset with three batches. In the PCA, there is an intra-batch effect within batch 1. We therefore repeated the experiment in three new batches, using the same samples but randomizing their order so that they were randomly allocated to the new batches. The results of the repeated experiment are consistent with the first one. However, there is again an intra-batch effect, this time in new batch 3 (where the samples and their order are completely different).
My question is: how can I show that this effect is random, so that the results are not affected by the intra-batch effect?
So far, I have fitted a linear model, expression ~ batch, and calculated the median residual of each gene. I then calculated the Pearson correlation of the median residuals of the genes shared between the two experiments, which gives Pearson r = 0.3756546 and p-value < 2.2e-16. But I think I have misunderstood something here...
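To make the procedure concrete, here is a minimal Python sketch of what I described, run on made-up data. It assumes batch is treated as a categorical predictor, so the fitted value of expression ~ batch for a gene is just that gene's per-batch mean. All function names and the toy data are hypothetical, not my actual pipeline.

```python
import random
from statistics import mean, median


def batch_residuals(values, batches):
    """Residuals of expression ~ batch for one gene.

    Assumption: batch is categorical, so the least-squares fitted value
    for each sample is the mean of its batch.
    """
    by_batch = {}
    for v, b in zip(values, batches):
        by_batch.setdefault(b, []).append(v)
    batch_mean = {b: mean(vs) for b, vs in by_batch.items()}
    return [v - batch_mean[b] for v, b in zip(values, batches)]


def median_residuals(expr, batches):
    """expr maps gene name -> list of expression values, one per sample."""
    return {g: median(batch_residuals(vals, batches)) for g, vals in expr.items()}


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def shared_gene_correlation(expr1, batches1, expr2, batches2):
    """Correlate per-gene median residuals over genes present in both experiments."""
    m1 = median_residuals(expr1, batches1)
    m2 = median_residuals(expr2, batches2)
    shared = sorted(set(m1) & set(m2))
    r = pearson_r([m1[g] for g in shared], [m2[g] for g in shared])
    return r, len(shared)


# Toy data only: 200 hypothetical genes, 12 samples in 3 batches per experiment.
rng = random.Random(0)
batches = [1] * 4 + [2] * 4 + [3] * 4
genes = [f"gene{i}" for i in range(200)]
expr1 = {g: [rng.gauss(0, 1) + b for b in batches] for g in genes}
expr2 = {g: [rng.gauss(0, 1) + b for b in batches] for g in genes}

r, n_shared = shared_gene_correlation(expr1, batches, expr2, batches)
print(f"Pearson r over {n_shared} shared genes: {r:.3f}")
```

One thing I noticed while writing this out: with many thousands of genes, even a modest r comes with a very small p-value, so the p-value alone may not say much about how strong the shared structure is.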
Thanks a lot!
I'm not privy to how these issues are tackled, but I found this paper helpful: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3880143/#!po=21.4286
Great link - thanks. I will have to read.