I am trying to correct a batch effect using ComBat.
About 40% of my genes end up with at least one negative value after correction. If I just drop those genes, the resulting normalized PCA plot clusters neatly (but I lose 40% of the genes):
I've tried setting the negative values to zero, but that gives really bad PCA clustering (especially along PC1):
Is there a way to avoid losing almost half the data without distorting it too much?
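To make the two options concrete, here is a minimal Python sketch on a made-up toy matrix. It is not the real data or the actual pipeline: the gene names, values, and helper functions are all hypothetical. It only shows what "drop every gene with a negative value" and "set negatives to zero" do to a corrected matrix.

```python
# Toy batch-corrected expression matrix: one entry per gene, one value per sample.
# All names and numbers are invented for illustration.
corrected = {
    "geneA": [5.2, 4.8, 6.1, 5.5],
    "geneB": [0.4, -0.3, 0.9, 0.1],
    "geneC": [2.0, 2.2, -0.1, 1.8],
    "geneD": [7.3, 7.0, 6.8, 7.1],
    "geneE": [1.1, 0.9, 1.4, 1.0],
}


def genes_with_negatives(matrix):
    """Return the genes that have at least one negative value."""
    return [gene for gene, values in matrix.items() if any(v < 0 for v in values)]


def drop_negative_genes(matrix):
    """Option 1: remove every gene that has any negative value."""
    bad = set(genes_with_negatives(matrix))
    return {gene: values for gene, values in matrix.items() if gene not in bad}


def clip_to_zero(matrix):
    """Option 2: keep all genes but replace negative values with zero."""
    return {gene: [max(v, 0.0) for v in values] for gene, values in matrix.items()}


negative_genes = genes_with_negatives(corrected)
fraction_affected = len(negative_genes) / len(corrected)
dropped = drop_negative_genes(corrected)
clipped = clip_to_zero(corrected)

print("genes with a negative value:", negative_genes)
print("fraction of genes affected:", fraction_affected)
print("genes kept after dropping:", list(dropped))
print("geneB after clipping:", clipped["geneB"])
```

Dropping removes whole genes, while clipping keeps every gene but changes only the samples that went negative, which alters the relative differences between samples for those genes.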
Hi Lahat,
It would be useful to know a little background about your samples and how you are using ComBat for batch normalization, and to see before-and-after boxplots of each sample.
Regards,
Mamun
Hi. The samples are mouse RNA-seq data from several treatments. The sequencing was done in three batches: batches 1 and 2 were done using the TruSeq method, and batch 3 was done using the gencore method.
Without normalization there is a very strong batch effect between methodologies:
Here is the code: