Limma contrast matrix help
1
1
Entering edit mode
9.3 years ago
chris86 ▴ 400

Hi

I thought I had contrast matrices and limma worked out, but it appears I don't. The data frame is a series of columns (samples) with rows (genes) and I am splitting this up according on the different groups to compare which are in order.

len1 <- 7
len2 <- 12
len3 <- 5

group <- factor(c(rep(1, times = len1), rep(2, times = len2), rep(3, times = len3)))

design <- model.matrix(~0+group)
matrix <- data.matrix(data6)
colnames(design)[1] <- 's1'
colnames(design)[2] <- 's2'
colnames(design)[3] <- 's3'

contrast.matrix <- makeContrasts('s1-s2','s2-s3','s1-s3', levels = design)

fit <- lmFit(matrix,design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)
nrow(sig)

This gives me 400 differentially expressed genes. All I want to do is compare s1 gene profile with s2. However, when I exclude s3 (remove it from the data frame) and just run the following:

group <- factor(c(rep(1, times = len1), rep(2, times = len2)))
design <- model.matrix(~group) # from EdgeR
matrix <- data.matrix(data6)

#

fit <- lmFit(matrix,design)
fit <- eBayes(fit)
top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)

This gives a 1000 differentially expressed genes.

So the two methods are different, but all I thought I was doing in the first step was comparing 2 gene profiles so why the difference?

Thanks

next-gen • 5.1k views
ADD COMMENT
0
Entering edit mode

On the second linear model fit, did you subset your input matrix to only include s1 and s2 samples? - The other thing is that you have a non intercept model in the first design, and an intercept model on the second. That's most likely where you're seeing differences.

ADD REPLY
0
Entering edit mode
9.3 years ago

It's likely that shrinkage changed due to the removal of the samples from group 3, so put them back in the model and do:

design <- model.matrix(~group)
# ...stuff...
top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")

and see what you get.

ADD COMMENT
0
Entering edit mode

Yes the S3 group is a lot more variable and it is effecting the estimates of DE genes between S1 and S2 because the variance is being shared. Thanks.

ADD REPLY

Login before adding your answer.

Traffic: 1595 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6