Hi
I thought I had contrast matrices and limma worked out, but it appears I don't. The data frame is a series of columns (samples) with rows (genes) and I am splitting this up according on the different groups to compare which are in order.
len1 <- 7
len2 <- 12
len3 <- 5
group <- factor(c(rep(1, times = len1), rep(2, times = len2), rep(3, times = len3)))
design <- model.matrix(~0+group)
matrix <- data.matrix(data6)
colnames(design)[1] <- 's1'
colnames(design)[2] <- 's2'
colnames(design)[3] <- 's3'
contrast.matrix <- makeContrasts('s1-s2','s2-s3','s1-s3', levels = design)
fit <- lmFit(matrix,design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)
nrow(sig)
This gives me 400 differentially expressed genes. All I want to do is compare s1 gene profile with s2. However, when I exclude s3 (remove it from the data frame) and just run the following:
group <- factor(c(rep(1, times = len1), rep(2, times = len2)))
design <- model.matrix(~group) # from EdgeR
matrix <- data.matrix(data6)
#
fit <- lmFit(matrix,design)
fit <- eBayes(fit)
top2 <- topTable(fit,coef=2,number=Inf,sort.by="P")
sig <- subset(top2, adj.P.Val<0.05)
This gives a 1000 differentially expressed genes.
So the two methods are different, but all I thought I was doing in the first step was comparing 2 gene profiles so why the difference?
Thanks
On the second linear model fit, did you subset your input matrix to only include s1 and s2 samples? - The other thing is that you have a non intercept model in the first design, and an intercept model on the second. That's most likely where you're seeing differences.