7.0 years ago
zaynabmousavian
Hi,
I have several microarray gene expression datasets from GEO, all generated on the same platform. Each contains samples of a single tumor type, with no controls. I want to merge them into one dataset for downstream analysis. Do I need to remove batch effects between the datasets? If so, how can I do this?
Regards
Batch effects would indeed be a major problem. You can remove them when sample replicates are spread across the different batches, by including batch in the model matrix in limma. If that is not the case, removal may be difficult or impossible, because the batch effects could be completely confounded with your biological differences. It sounds like you may have many sources of non-biological variation, such as samples coming from different labs, tissues, processing times, chips and scanners, even though all the data were generated on the same platform.
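To make the confounding point concrete, here is a minimal sketch in plain Python (not limma) that checks whether any biological group is present in more than one batch. The function name `groups_shared_across_batches` and the toy labels are made up for illustration. If no group is shared, batch and biology cannot be separated across batches.

```python
from collections import defaultdict

def groups_shared_across_batches(batches, groups):
    """Return the set of biological groups observed in at least two batches.

    batches: batch label per sample (e.g. the GEO series each sample came from)
    groups:  biological group label per sample, in the same order
    """
    seen = defaultdict(set)
    for batch, group in zip(batches, groups):
        seen[group].add(batch)
    return {group for group, in_batches in seen.items() if len(in_batches) >= 2}

# Hypothetical annotations: two batches, four samples each time.
batches = ["batch1", "batch1", "batch2", "batch2"]

# Both groups appear in both batches, so a batch term can be estimated.
balanced = ["tumor", "normal", "tumor", "normal"]

# Each group lives in exactly one batch: batch and group are fully confounded.
confounded = ["tumor", "tumor", "normal", "normal"]

shared_balanced = groups_shared_across_batches(batches, balanced)
shared_confounded = groups_shared_across_batches(batches, confounded)
```

An empty result (as for `confounded`) is the situation described above, where batch adjustment cannot be trusted to leave the biological difference intact.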
Thanks for your reply. I want to integrate the GSE13294 and GSE14333 datasets into one. Could you please take a look at these datasets and let me know whether that is possible?
They are the exact same array type, which is great. Just process them together as a single experiment and then adjust for batch as described here (follow the link to the Bioconductor thread): A: Doubt about batch effect and gene expression analysis
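As a rough illustration of what "adjusting for batch" does to the numbers, here is a sketch of the simplest possible adjustment: per-gene mean-centering within each batch. This is not the method from the linked thread and is not what limma or ComBat do internally. It is a stand-in technique that only removes a shift in mean, and it assumes both datasets contain biologically comparable samples. The function name `center_by_batch` and the expression values are invented for the example.

```python
from statistics import mean

def center_by_batch(expr, batches):
    """Remove per-gene mean differences between batches.

    expr:    list of rows, one row per gene, one value per sample
             (already normalized together, on the log scale)
    batches: batch label per sample, in column order
    Returns a new matrix in which every batch has the same mean for each gene.
    """
    labels = set(batches)
    adjusted = []
    for row in expr:
        overall = mean(row)
        # Mean of this gene within each batch.
        batch_means = {
            b: mean([v for v, lab in zip(row, batches) if lab == b])
            for b in labels
        }
        # Subtract the batch mean, then add back the overall gene mean
        # so values stay on the original scale.
        adjusted.append(
            [v - batch_means[lab] + overall for v, lab in zip(row, batches)]
        )
    return adjusted

# Toy data: one gene, two samples from each series (values are made up).
batches = ["GSE13294", "GSE13294", "GSE14333", "GSE14333"]
expr = [[5.0, 7.0, 9.0, 11.0]]

adjusted = center_by_batch(expr, batches)
```

In the toy data the second series sits 4 units higher than the first. After centering, both series have the same mean for the gene, while the differences between samples within each series are unchanged.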