Hello everyone, I'm posting here for the first time because I need some help. I need to perform a differential methylation analysis between cases and controls, adding surrogate variables as covariates alongside the principal components of the phenotypic variables that I have already included. This is the model I currently use for the differential methylation analysis with limma:
design <- model.matrix(~0+Sample_Group.1 + Dim.1 + Dim.2 + Dim.3 + Dim.4 + Dim.5, data=dataPC)
My questions are as follows:
First, does it make sense to run a surrogate variable analysis when the methylation data matrix has already been corrected for batch effects with ComBat?
My second question concerns the number of surrogate variables to extract and the model to pass to the "twostepsva.build(dat, mod, n.sv)" command. I don't know whether it is correct to use the same model matrix that I built for the differential analysis, or this model:
mod <- model.matrix(Sample_group ~ Dim.1 + Dim.2 + Dim.3 + Dim.4 + Dim.5, data=data_PC)
which is the one I actually used. Here "dat" is the beta matrix after ComBat correction. My final question is whether it is correct to include the extracted surrogate variables as additional regressors in the methylation model (while checking for collinearity between the surrogate variables and the principal components).
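To make the last question concrete, here is a minimal sketch of what adding the surrogate variables as regressors could look like. It runs on simulated data only: the contents of `dataPC` are made up, the `sv` matrix is a random stand-in for the surrogate variables (with real data it would presumably be the `sv` element returned by twostepsva.build), and the names `SV1`, `SV2`, `design_sv`, `full_rank` and `sv_pc_cor` are hypothetical. It is not a recommendation, just an illustration of the structure being asked about.

```r
set.seed(1)
n <- 20

# Hypothetical stand-in for dataPC: group labels plus five phenotype PCs
dataPC <- data.frame(
  Sample_Group.1 = factor(rep(c("case", "control"), each = n / 2)),
  matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("Dim.", 1:5)))
)

# Random stand-in for the surrogate variables (assumption: with real data
# this would be the `sv` element of twostepsva.build(dat, mod, n.sv))
sv <- matrix(rnorm(n * 2), nrow = n, dimnames = list(NULL, c("SV1", "SV2")))

# Same design as in the post, then the surrogate variables appended as columns
design <- model.matrix(~ 0 + Sample_Group.1 + Dim.1 + Dim.2 + Dim.3 + Dim.4 + Dim.5,
                       data = dataPC)
design_sv <- cbind(design, sv)

# Collinearity checks: the combined design should be full rank, and the
# SV-by-PC correlations show how much the two sets of covariates overlap
full_rank <- qr(design_sv)$rank == ncol(design_sv)
sv_pc_cor <- cor(sv, dataPC[, paste0("Dim.", 1:5)])
```

With real data, `design_sv` would then be passed to limma in place of `design`, provided `full_rank` is TRUE and none of the entries of `sv_pc_cor` is close to 1 in absolute value.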