I have count data for 18 samples from two conditions, sequenced in two batches. I would normally run the DE analysis with batch as a covariate.
A density plot of the scaled CPM data shows a bimodal distribution. Density plots of each batch separately show single-peaked, roughly normal distributions, so I conclude that the bimodality is caused by the batch effect.
My understanding is that when running batch correction with ComBat, you need to select the non-parametric correction if the distribution isn't normal, because the default parametric method assumes normality. I also know that DE is supposed to be run on the raw counts rather than on corrected values, so I can't use the batch-corrected data. What I don't fully understand is how DE programs adjust for covariates, and whether they need normally distributed data to do so.
So my question is: can I run edgeR or DESeq2 with batch as a covariate the way I normally do, or will the bimodality cause issues?
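To make the question concrete, here is a minimal sketch of what "batch as a covariate" means in terms of a design matrix. It is only an illustration under stated assumptions: the data are simulated (not my real counts), the sample size, batch shift, and condition effect are hypothetical values, and ordinary least squares on log-scale values is swapped in as a simplified stand-in for the negative binomial GLMs that edgeR and DESeq2 fit on the counts. The point it tries to show is that batch is estimated as its own term alongside condition rather than subtracted beforehand, so the pooled values can be bimodal while the residuals around the fitted model are not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical settings, chosen only for a stable illustration.
n_per_cell = 50          # samples per (batch, condition) cell; NOT the real 18-sample design
batch_shift = 4.0        # assumed batch effect on the log scale
condition_effect = 1.0   # assumed true condition effect on the log scale
noise_sd = 0.5           # assumed residual standard deviation

# Balanced design: both conditions appear in both batches.
batch = np.repeat([0, 1], 2 * n_per_cell)
condition = np.tile(np.repeat([0, 1], n_per_cell), 2)

# Simulated log-scale expression for a single gene.
y = (5.0
     + condition_effect * condition
     + batch_shift * batch
     + rng.normal(0.0, noise_sd, size=batch.size))

# Design matrix with intercept, condition, and batch columns,
# analogous to a formula like ~ batch + condition.
X = np.column_stack([np.ones_like(y), condition, batch])

# Simplified stand-in for the GLM fit: ordinary least squares.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

print("estimated condition effect:", round(float(coef[1]), 3))
print("estimated batch effect:", round(float(coef[2]), 3))
print("SD of pooled values:", round(float(y.std()), 3))
print("SD of residuals:", round(float(residuals.std()), 3))
```

In this toy setup the pooled values have two modes (one per batch), yet the condition effect is estimated with batch held in the model, and the spread of the residuals is close to the assumed noise level. Whether the same reasoning carries over to my real data depends on the conditions being represented in both batches, which the sketch assumes.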
Hello, could you please add the code and the resulting figure to your post? It is difficult to follow from a textual description alone. Thanks.