Limma experiment design and making contrasts
3.6 years ago
kra277 • 0

Hi,

I am a novice working on a 450k methylation array analysis. I have a very simple design: I want to find the differentially methylated genes between smokers (1) and non-smokers (0). This is what I did:

# using smoking_primary as the factor in interest
design <- model.matrix(~0 + smoking_primary)

# Make contrasts 0 is the control and 1 is the test
contrast <- makeContrasts(smoking_primary0 - smoking_primary1, 
                          levels = design)

# fit to methylation set
fit <- lmFit(m_norm_qc, design)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)

## Add the annotations to the results
ann450kSub <- ann450k[match(rownames(m_norm_qc),ann450k$Name),
                     c(1:4,12:19,24:ncol(ann450k))]

DMPs <- topTable(fit2, num=Inf, coef=1, genelist = ann450kSub)

Could you please review this and tell me if it is the correct way to do the analysis?

In addition, how should I approach adding covariates to my design? If you could point me to a resource with more information, that would be very helpful. I checked the limma manual, but it seems a little confusing for a simple design like mine.

Thank you for your time on this post.

limma methylation 450k • 1.9k views
3.6 years ago

It seems generally okay. For the contrast, you may want to instead use:

contrast <- makeContrasts(
  smoking = smoking_primary1 - smoking_primary0, 
  levels = design)

That is, we assign a name, smoking, to the contrast, and we put 1 (smokers) first and 0 (non-smokers) second, so that a positive logFC means higher methylation in smokers than in non-smokers.

Later when you run topTable(), I am of the belief that it is 'safer' to refer to coefficients by name; so, you'd use:

DMPs <- topTable(fit2, num = Inf, coef = 'smoking', genelist = ann450kSub)
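One thing worth checking before building the contrast: the column names smoking_primary0 and smoking_primary1 only exist if smoking_primary is a factor. The sketch below uses a made-up 0/1 vector (smoking_numeric is a hypothetical name, not from the original post) and base R only, to show how the design columns differ between numeric and factor coding.

```r
# Hypothetical phenotype vector coded 0/1, as described in the question
smoking_numeric <- c(0, 1, 0, 1, 1, 0)

# Numeric coding: model.matrix() produces a single slope column,
# so there are no per-group columns to build a contrast from
design_numeric <- model.matrix(~0 + smoking_numeric)
colnames(design_numeric)

# Factor coding: one column per level, named <variable><level>
smoking_primary <- factor(smoking_numeric, levels = c(0, 1))
design <- model.matrix(~0 + smoking_primary)
colnames(design)
```

If your variable is stored as numeric, converting it with factor() before calling model.matrix() should give the column names that makeContrasts() expects here.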


With regard to covariates, these are added when you create the design:

design <- model.matrix(~0 + smoking_primary + BMI + sex + income)

Then, to adjust for these, you rebuild the contrast against the new design (levels = design) and derive test statistics for smoking_primary as you did previously. limma then estimates the smoking effect conditional on the covariates in the model, so the adjustment is done for you.
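As a rough end-to-end sketch of the covariate-adjusted workflow, here is the same pipeline on simulated data. Everything in pheno and m_sim is made up for illustration (m_sim stands in for m_norm_qc), and the limma calls are wrapped in a requireNamespace() check so the design part still runs if limma is not installed.

```r
set.seed(1)
n <- 12

# Hypothetical sample sheet; all values are simulated
pheno <- data.frame(
  smoking_primary = factor(rep(c(0, 1), each = n / 2)),
  BMI    = rnorm(n, mean = 26, sd = 3),
  sex    = factor(rep(c("F", "M"), times = n / 2)),
  income = rnorm(n, mean = 50, sd = 10)
)

# Both smoking levels get their own column; covariates follow
design <- model.matrix(~0 + smoking_primary + BMI + sex + income,
                       data = pheno)
print(colnames(design))

# Simulated M-values: 100 probes x n samples
m_sim <- matrix(rnorm(100 * n), nrow = 100,
                dimnames = list(paste0("cg", 1:100), NULL))

if (requireNamespace("limma", quietly = TRUE)) {
  # The contrast must be rebuilt against the new design columns
  contrast <- limma::makeContrasts(
    smoking = smoking_primary1 - smoking_primary0,
    levels = design)
  fit  <- limma::lmFit(m_sim, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))
  print(head(limma::topTable(fit2, coef = "smoking", number = Inf)))
}
```

The contrast only involves the two smoking_primary columns; BMI, sex and income get a weight of zero in it but still soak up their share of the variance in each per-probe model.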

Kevin


That is very insightful. Thank you very much for the answer. Also, if I may ask, could you please point me to the articles for understanding the usage of design and contrasts?

Thanks again for your time


Hi, these follow the same principles as formulae used in regression modelling, so, you may want to focus on that (when searching). What limma is doing is running independent models of the form:

gene1 ~ 0 + smoking_primary
gene2 ~ 0 + smoking_primary
gene3 ~ 0 + smoking_primary
et cetera

That is, it's a linear regression.
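To make the "one regression per row" idea concrete, here is a small base R sketch on simulated data (the matrix y and its row names are invented for illustration). It fits lm() row by row and checks that, with no covariates, the coefficients are just the group means. lmFit() computes the same least-squares coefficients for every row at once; eBayes() then moderates the variances across rows, which this sketch does not attempt to reproduce.

```r
set.seed(42)
smoking_primary <- factor(rep(c(0, 1), each = 5))

# Simulated values for three features (rows) across ten samples
y <- matrix(rnorm(30), nrow = 3,
            dimnames = list(c("gene1", "gene2", "gene3"), NULL))

# One ordinary least-squares fit per row
coefs <- t(apply(y, 1, function(row) coef(lm(row ~ 0 + smoking_primary))))
print(coefs)

# With no covariates, each coefficient is simply the group mean
group_means <- t(apply(y, 1, function(row) tapply(row, smoking_primary, mean)))
print(all.equal(unname(coefs), unname(group_means)))

# The contrast smoking_primary1 - smoking_primary0 is the difference
# between the two columns
print(coefs[, "smoking_primary1"] - coefs[, "smoking_primary0"])
```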


Thank you very much for this. Much appreciated.
