Hi all,
I've found that the order of variables in the design formula for a model with a batch effect differs between DESeq2 and limma. I'm wondering whether I have misinterpreted something, or whether this is a real discrepancy between the two packages.
Here's how I would model the batch effect with DESeq2:
dds <- DESeqDataSetFromMatrix(countData = df,
                              colData = metadata,
                              design = ~ batch + condition)
For limma I would do the following:
design <- model.matrix(~0 + condition + batch)
I'm slightly less confident about the design matrix that I specified for limma. I want to ask whether this is the correct order for the variables, and whether I should include the 0 preceding all my variables so that I can specify contrasts when performing pairwise comparisons.
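For what it's worth, a small base-R sketch shows what the 0 and the term order do to the columns of the design matrix. The toy factors below (two conditions, two batches, eight samples) are made up for illustration and are not taken from the post above.

```r
# Hypothetical toy layout: 2 conditions x 2 batches, 8 samples in total
condition <- factor(rep(c("ctrl", "trt"), each = 4))
batch     <- factor(rep(c("b1", "b2"), times = 4))

# With ~0, only the FIRST factor gets one column per level;
# later factors are coded relative to their first level.
design_a <- model.matrix(~0 + condition + batch)
colnames(design_a)
# "conditionctrl" "conditiontrt" "batchb2"

design_b <- model.matrix(~0 + batch + condition)
colnames(design_b)
# "batchb1" "batchb2" "conditiontrt"
```

Putting condition first after the 0 gives one column per condition level, which is convenient when pairwise contrasts are written as differences between those columns. In DESeq2 the variable of interest is conventionally placed last because `results()` reports the last variable in the design by default, so the two orderings reflect different conventions rather than different models.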
Hi,
On a similar note, I am running limma through DEP2 and found that the order of factors/covariates in the design formula does affect the final analysis.
This is the code I have used:
I have tested a number of iterations of the design formula. Using the same combination of factors/covariates but in a different order changed the number of significantly differentially expressed proteins.
In the image I have a summary table showing the number of significant proteins (proteinnum) and the smallest adjusted p-value in the dataset (BH) for the different combinations. If you look at the second-to-last and third-to-last rows in the table, the same covariates in a different order produce a different result.
Why might this be?
Thanks heaps
I don't know anything about DEP2 but I can tell you for sure that limma gives the same results regardless of the order of terms in the model (apart from the intercept removal ~0 affecting the first term).
If you are getting different results from the 2nd and 3rd last models, then that indicates a bug either in your code or in DEP2.
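As a rough sanity check of this point, the sketch below fits the same toy model with the terms in both orders. It uses base R's `lm` on simulated data as a stand-in for a single-row linear model fit; it does not call limma or DEP2, and all names and numbers are invented for illustration.

```r
set.seed(1)

# Hypothetical toy data: one feature measured in 8 samples
condition <- factor(rep(c("ctrl", "trt"), each = 4))
batch     <- factor(rep(c("b1", "b2"), times = 4))
y <- rnorm(8) + 2 * (condition == "trt") + 0.5 * (batch == "b2")

# Same terms, different order, intercept kept
fit_a <- lm(y ~ condition + batch)
fit_b <- lm(y ~ batch + condition)

# Estimate, standard error, t and p for the condition effect agree
row_a <- summary(fit_a)$coefficients["conditiontrt", ]
row_b <- summary(fit_b)$coefficients["conditiontrt", ]
same_with_intercept <- isTRUE(all.equal(row_a, row_b))

# With ~0 the parametrization depends on the first term, but the
# trt-vs-ctrl comparison is still the same quantity.
fit_c <- lm(y ~ 0 + condition + batch)
fit_d <- lm(y ~ 0 + batch + condition)
diff_c <- unname(coef(fit_c)["conditiontrt"] - coef(fit_c)["conditionctrl"])
diff_d <- unname(coef(fit_d)["conditiontrt"])
same_without_intercept <- isTRUE(all.equal(diff_c, diff_d))
```

Both flags come out `TRUE`, because reordering terms only permutes or reparametrizes the columns of the design matrix without changing the space they span. If a pipeline reports different results for reordered terms, it is worth checking which coefficient or contrast is actually being extracted in each run.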
Hey Gordan,
Thanks heaps for the prompt reply.