I have the model:
~group+batch+age+BMI
which I have been using in limma, where "group" is the biologically interesting variable and the others are confounders. What is the best method/package for removing the effects of the confounding covariates prior to PCA, so that the clustering I see reflects only the biologically relevant signal?
I have noticed several of the packages available in R either assume only one confounding covariate or only deal with categorical variables.
The only bit I could find talking about continuous variables:
"By default, all adjustment variables will be treated as factor variables by the ComBat function. If you would like to include continuous adjustment variables, also create a vector containing the column numbers of the continuous covariates in the model matrix. This vector must then be input into ComBat via the numCovs option.
We now apply the ComBat function to the data, using parametric empirical Bayesian adjustments.
...
This returns an expression matrix, with the same dimensions as your original dataset. This new expression matrix has been adjusted for batch."
So this is talking about including them in the model to make the batch effect removal more precise - not about removing the effect of the continuous variables. Please correct me if there is something else I missed.
Hi Scott, I'm not sure if this will be helpful, but I just stumbled on the cbcbSEQ package, which might be of use to you. In particular, check out the vignette (PDF), where they use a modified ComBat function (combatMod). Maybe this is closer to what you are after?
Hey, did you find a solution? I am wondering the same thing. :)
Sorry for the late response; I haven't been on here in a while and the alert must have been filtered out of my inbox. Unfortunately I couldn't find a solution, so I just went ahead and used ComBat to remove the categorical batch effect and let the effect of the continuous variable persist.
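For anyone landing here with the same question: one generic way to handle both categorical and continuous confounders is to fit the full linear model per gene and subtract only the fitted nuisance terms (batch, age, BMI) before running PCA, leaving the group effect in place. As far as I know, limma's removeBatchEffect does essentially this, taking the categorical batch via its batch argument, continuous covariates via covariates, and the terms to preserve via design. The sketch below does the same thing by hand in base R so that it runs without Bioconductor. The data are simulated and every variable name and value is made up for illustration; it is not the ComBat approach and not necessarily what any package does internally.

```r
# Simulated example: 200 genes x 24 samples (all values are made up).
set.seed(1)
n     <- 24
group <- factor(rep(c("ctrl", "case"), each = n / 2))
batch <- factor(rep(c("b1", "b2", "b3"), times = n / 3))
age   <- runif(n, 20, 70)
BMI   <- rnorm(n, 25, 4)

# Expression matrix with artificial batch and age effects added.
expr <- matrix(rnorm(200 * n), nrow = 200)
expr <- expr +
  outer(rnorm(200), as.numeric(batch)) +
  outer(rnorm(200, sd = 0.05), age)

# Same model as in the question: ~group+batch+age+BMI
design <- model.matrix(~ group + batch + age + BMI)

# Least-squares fit for all genes at once (samples in rows).
beta <- lm.fit(design, t(expr))$coefficients   # coefficients x genes

# Nuisance columns = everything except the intercept and the group term.
nuisance <- !grepl("^(\\(Intercept\\)|group)", colnames(design))

# Centre the nuisance columns so each gene keeps its overall mean,
# then subtract only the fitted nuisance contribution.
Xn       <- scale(design[, nuisance], center = TRUE, scale = FALSE)
adjusted <- expr - t(Xn %*% beta[nuisance, , drop = FALSE])

# PCA on the adjusted matrix (samples in rows).
pca <- prcomp(t(adjusted))
```

The adjusted matrix is meant for visualisation (PCA, clustering, heatmaps) only. For the differential expression test itself, the usual advice is to keep batch, age, and BMI in the limma design rather than testing on pre-adjusted data.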