Making a contrast matrix with multiple co-variates for differential expression analysis
5.2 years ago
dodausp ▴ 190

Hi, there!

I am trying to build a contrast matrix in order to fit a linear model. It is a basic comparison between different histologic types of tumors: benign or BL, early stage, and late stage. The goal is to investigate whether FFPE (formalin-fixed, paraffin-embedded) material differs from fresh-frozen material in terms of methylation pattern (we're using Illumina's EPIC array). To that end, we collected FFPE and fresh-frozen samples from the same patient.

The basic experiment looks something like this:

> clindata
   Subject Material_Source  Tumor_stage        ID2
1     P235            FFPE Benign_or_BL  P235_FFPE
2     P432            FFPE Benign_or_BL  P432_FFPE
3     P421            FFPE        Early  P421_FFPE
4      P93            FFPE        Early   P93_FFPE
5     P876            FFPE        Early  P876_FFPE
6     P543            FFPE         Late  P543_FFPE
7     P532            FFPE         Late  P532_FFPE
8     P152            FFPE         Late  P152_FFPE
9     P235           Fresh Benign_or_BL P235_Fresh
10    P432           Fresh Benign_or_BL P432_Fresh
15    P421           Fresh        Early P421_Fresh
16     P93           Fresh        Early  P93_Fresh
17    P876           Fresh        Early P876_Fresh
24    P543           Fresh         Late P543_Fresh
25    P532           Fresh         Late P532_Fresh
26    P152           Fresh         Late P152_Fresh

Here, clindata$Subject is the patient ID, and the following two columns give the source of material and the tumor stage, respectively. clindata$ID2 is the concatenation of the values in clindata$Subject and clindata$Material_Source.
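For anyone who wants to reproduce the table, here is a minimal sketch of how it could be rebuilt in R. The values are copied from the printout above; building ID2 with paste() is an assumption about how the merge was done, and the original row names (1-10, 15-17, 24-26) are not reproduced.

```r
# Rebuild the example sample sheet (values copied from the printout above).
subjects <- c("P235", "P432", "P421", "P93", "P876", "P543", "P532", "P152")
stages   <- c("Benign_or_BL", "Benign_or_BL", "Early", "Early", "Early",
              "Late", "Late", "Late")

clindata <- data.frame(
  Subject         = rep(subjects, times = 2),        # each patient appears twice
  Material_Source = rep(c("FFPE", "Fresh"), each = 8),
  Tumor_stage     = rep(stages, times = 2),
  stringsAsFactors = FALSE
)

# Assumed construction of ID2: patient ID and material source joined by "_".
clindata$ID2 <- paste(clindata$Subject, clindata$Material_Source, sep = "_")

print(clindata)
```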

So, now comes my question: How to build the contrast matrix for comparison between different tumor stages, but accounting for the patient and material source variables?

My idea is the following:

# Preparing the data:
TS <- factor(clindata$Tumor_stage)   # tumor stage levels
SubMS <- factor(clindata$ID2)        # subject + material source levels

# Designing the matrix:
design <- model.matrix(~0+Tumor_stage+ID2, data=clindata)
colnames(design) <- c(levels(TS), levels(SubMS)[-1])
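One quick sanity check on a design like this is to compare its number of columns and its rank with the number of samples. The sketch below rebuilds the example sample sheet (ID2 via paste() is an assumption) and uses only base R. Because ID2 takes a different value for every sample, the formula produces more coefficients than there are arrays.

```r
# Rebuild the example sample sheet so the check is self-contained.
subjects <- c("P235", "P432", "P421", "P93", "P876", "P543", "P532", "P152")
stages   <- c("Benign_or_BL", "Benign_or_BL", "Early", "Early", "Early",
              "Late", "Late", "Late")
clindata <- data.frame(
  Subject         = rep(subjects, times = 2),
  Material_Source = rep(c("FFPE", "Fresh"), each = 8),
  Tumor_stage     = rep(stages, times = 2),
  stringsAsFactors = FALSE
)
clindata$ID2 <- paste(clindata$Subject, clindata$Material_Source, sep = "_")

# Same formula as in the post.
design <- model.matrix(~ 0 + Tumor_stage + ID2, data = clindata)

n_samples <- nrow(design)        # number of arrays
n_coef    <- ncol(design)        # number of coefficients requested
rank_d    <- qr(design)$rank     # number of coefficients that are estimable

cat("samples:", n_samples, " columns:", n_coef, " rank:", rank_d, "\n")

# Residual degrees of freedom available for estimating the variance.
cat("residual df:", n_samples - rank_d, "\n")
```

If the rank equals the number of samples, no residual degrees of freedom remain, so a fit with lmFit() would have nothing left from which to estimate variability.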

After that, I can run the lmFit() and makeContrasts() functions together with the array data. Of course, the n for each group is rather small, but this is just an example (more samples will be added to each group in the final experiment). My question is:

Does that design make sense?

Would you suggest anything different (e.g. (A) treat all 3 classes separately, instead of merging the 2 co-variates as one co-variate; or (B) consider only the "Subject" group as a co-variate, since the pairwise comparison would already account for one sample being FFPE and the other fresh-frozen)?
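To make option (A) concrete, here is a minimal base-R sketch that keeps tumor stage and material source as separate terms and writes the stage contrasts out by hand. The short column names and the contrast names (EarlyVsBenign and so on) are illustrative choices, not anything prescribed by limma, and the sample sheet is rebuilt from the printout above.

```r
# Rebuild the example sample sheet so the sketch is self-contained.
subjects <- c("P235", "P432", "P421", "P93", "P876", "P543", "P532", "P152")
stages   <- c("Benign_or_BL", "Benign_or_BL", "Early", "Early", "Early",
              "Late", "Late", "Late")
clindata <- data.frame(
  Subject         = rep(subjects, times = 2),
  Material_Source = rep(c("FFPE", "Fresh"), each = 8),
  Tumor_stage     = rep(stages, times = 2),
  stringsAsFactors = FALSE
)

# Option (A): tumor stage and material source as separate covariates.
design <- model.matrix(~ 0 + Tumor_stage + Material_Source, data = clindata)
colnames(design) <- c("Benign_or_BL", "Early", "Late", "Fresh")  # illustrative names

# Hand-written contrasts between tumor stages; the material-source
# coefficient gets weight 0 because it is only adjusted for, not tested.
contrast_matrix <- cbind(
  EarlyVsBenign = c(-1,  1, 0, 0),
  LateVsBenign  = c(-1,  0, 1, 0),
  LateVsEarly   = c( 0, -1, 1, 0)
)
rownames(contrast_matrix) <- colnames(design)

print(contrast_matrix)
cat("columns:", ncol(design), " rank:", qr(design)$rank, "\n")
```

This is the kind of matrix makeContrasts() would be expected to produce from expressions such as Early - Benign_or_BL. Note that in this example every patient has a single tumor stage, so a Subject term added as a fixed effect would be nested within Tumor_stage. One alternative worth considering is treating Subject as a blocking factor through limma's duplicateCorrelation(), though whether that suits the real data is a judgment call.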

Any help is greatly appreciated here. Thanks!

R limma differential expression linear model • 921 views