Question

Limma covariates treatment is changing my significance results

3 months ago

ijarne ▴ 10

I have a differential expression analysis in limma that looks quite simple at first glance. I have 3 treatments distributed across 3 different tissues, and every treatment is present in every tissue. I would like to find the genes that are differentially expressed between treatments while taking the covariate "tissue" into account. To do so I have tried 2 different approaches:

Generating a "group" variable by pasting together the treatment and the tissue, and then comparing the different treatments.
Treating tissue as a covariate and performing the contrast comparing the treatments.

The fold changes are always the same for a given comparison in both approaches, so I know I am performing the same subtraction, but the P-values (and the adjusted P-values too, of course) change. Can someone explain to me what is producing this change and which approach is more correct? I have the sense it has to be the second, but I'd like some explanation.

Thanks!

limma differential-expression • 418 views

score 5 · Accepted Answer · 2024-09-16

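You have two different model specifications, which pool the data differently and therefore give different estimates of variance. The fold changes agree because both contrasts estimate the same difference in means; the p-values differ because the residual variance, and the degrees of freedom behind it, are not the same in the two models.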

Model 1: expr ~ tissue : treatment

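Here you have 9 group means (3 x 3), one per tissue-treatment combination, and the residual variance is estimated only from the scatter of replicates within those 9 groups.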

Model 2: expr ~ treatment + tissue

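Here you have 3 treatment effects and 3 tissue effects (3 + 3), which amount to 5 independent coefficients once the shared baseline is accounted for. The variation that the interaction term captures in model 1 is no longer fitted separately in model 2: it is pooled into the residual, which gains degrees of freedom in exchange. There is also the fully specified model, model 3, which is: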

Model 3: expr ~ treatment + tissue + treatment : tissue

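which is written with the full 15 terms (3 + 3 + [3 x 3]), although only 9 of them can be estimated independently, so it fits the same 9 group means as model 1. What it adds is that marginal effects are partitioned away from interaction effects.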
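To make the bookkeeping concrete, here is a minimal base-R sketch that builds the three design matrices and counts their coefficients. The balanced layout with 3 replicates per tissue-treatment combination and the names `targets`, `group`, `Tis1`, `Trt1`, etc. are assumptions for illustration, not details from the question.

```r
# Hypothetical balanced layout: 3 tissues x 3 treatments x 3 replicates (an assumption)
targets <- expand.grid(rep = 1:3,
                       treatment = c("Trt1", "Trt2", "Trt3"),
                       tissue = c("Tis1", "Tis2", "Tis3"))
targets$group <- factor(paste(targets$tissue, targets$treatment, sep = "_"))

# Model 1: one mean per tissue-treatment group
design1 <- model.matrix(~ 0 + group, data = targets)
# Model 2: additive treatment + tissue
design2 <- model.matrix(~ treatment + tissue, data = targets)
# Model 3: main effects plus interaction
design3 <- model.matrix(~ treatment + tissue + treatment:tissue, data = targets)

# Number of estimable coefficients in each model
n_coef <- c(model1 = ncol(design1), model2 = ncol(design2), model3 = ncol(design3))
# Residual degrees of freedom left over for estimating the variance
resid_df <- nrow(targets) - n_coef

print(n_coef)
print(resid_df)
```

With this layout, models 1 and 3 both use 9 coefficients and model 2 uses 5, so model 2 has 4 more residual degrees of freedom. That difference is one reason the same fold change can come with a different p-value.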

The difference you're observing is that the contrast (under model 1):

c(Tis1_Trt1, Tis2_Trt1, Tis3_Trt1) - c(Tis1_Trt2, Tis2_Trt2, Tis3_Trt2)

between treatments 1 and 2 across all tissues does not fully use the information you have about tissue means and variances. The treatment 3 samples contain both tissue effects (which are still relevant to this comparison) and treatment effects (which are not). These data can be appropriately pooled only by models 2 and 3. In other words, to test for a treatment effect across tissues, use model 2; if you're worried about a tissue-specific effect driving the results, you can use model 3.
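As a rough illustration of the comparison, the sketch below fits models 1 and 2 with limma on simulated data and tests treatment 1 against treatment 2 in each. The expression matrix is random noise, the balanced 3-replicate layout and all object names are hypothetical, and averaging the three group means (dividing by 3) is one assumed way to express "across all tissues".

```r
library(limma)

set.seed(1)
# Hypothetical balanced layout: 3 tissues x 3 treatments x 3 replicates (an assumption)
targets <- expand.grid(rep = 1:3,
                       treatment = c("Trt1", "Trt2", "Trt3"),
                       tissue = c("Tis1", "Tis2", "Tis3"))
targets$group <- factor(paste(targets$tissue, targets$treatment, sep = "_"))

# Simulated log-expression values (100 genes x 27 samples), for illustration only
expr <- matrix(rnorm(100 * nrow(targets)), nrow = 100,
               dimnames = list(paste0("gene", 1:100), NULL))

# Model 1: one coefficient per tissue-treatment group
design1 <- model.matrix(~ 0 + group, data = targets)
colnames(design1) <- levels(targets$group)
cont1 <- makeContrasts(
  Trt1vsTrt2 = (Tis1_Trt1 + Tis2_Trt1 + Tis3_Trt1) / 3 -
               (Tis1_Trt2 + Tis2_Trt2 + Tis3_Trt2) / 3,
  levels = design1)
fit1 <- eBayes(contrasts.fit(lmFit(expr, design1), cont1))

# Model 2: additive treatment + tissue (no intercept, so each treatment has a column)
design2 <- model.matrix(~ 0 + treatment + tissue, data = targets)
cont2 <- makeContrasts(Trt1vsTrt2 = treatmentTrt1 - treatmentTrt2, levels = design2)
fit2 <- eBayes(contrasts.fit(lmFit(expr, design2), cont2))

tab1 <- topTable(fit1, coef = "Trt1vsTrt2", number = Inf, sort.by = "none")
tab2 <- topTable(fit2, coef = "Trt1vsTrt2", number = Inf, sort.by = "none")

# Same fold changes in a balanced design, but different residual df and p-values
print(all.equal(tab1$logFC, tab2$logFC))
print(c(model1 = fit1$df.residual[1], model2 = fit2$df.residual[1]))
print(head(cbind(P_model1 = tab1$P.Value, P_model2 = tab2$P.Value)))
```

In a balanced design like this one the two `logFC` columns should match, while the residual degrees of freedom (and so the moderated t-statistics and p-values) differ between the two fits. With unbalanced data the fold changes themselves may also diverge.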