Dear Biostars, I'm analysing RNA-seq data with Limma-Voom and after reading and apply different methods of analysis I have arrived to two different results and I'm not sure which one is correct. Here the specs: n=60 (~30 per tissue) The experiment (Multi-level time course experiment):
dataForAnalysis = data.frame(Sample=c("s_1", "s_1", "s_2", "s_2", "s_1", "s_1", "s_2", "s_2"), Tissue=c("A", "A","A", "A", "B", "B", "B", "B"), Condition=c("before", "after","before", "after","before", "after","before", "after"))
Sample Tissue Condition(Sampling time)
s_1 A before
s_1 A after
s_2 A before
s_2 A after
s_1 B before
s_1 B after
s_2 B before
s_2 B after
Analysis No. 1
designMatrix = model.matrix(~ Tissue + Condition, dataForAnalysis)
nf = calcNormFactors(counts)
forDgeData.voom <- voom(counts, designMatrix, plot=False, lib.size=colSums(counts)*nf)
forDgeData.dups = duplicateCorrelation(forDgeData.voom, designMatrix, block= dataForAnalysis$Sample)
forDgeData.fitRobust = lmFit(forDgeData.voom, forDgeData.voom$design, correlation= forDgeData.dups$consensus, block= dataForAnalysis$Sample, method='robust')
forDgeData.ebayes = eBayes(forDgeData.fitRobust, robust=True)
Using this method, these are my results:
summary(decideTests(forDgeData.ebayes))
(Intercept) TissueA ConditionAfter
-1 1615 6594 22
0 446 1573 15631
1 13631 7525 39
From these results, I'm taking the ones in the "ConditionAfter" to get the DE genes between the conditions using:
toptable(forDgeData.ebayes, coef="ConditionAfter", ...)
Analysis No. 2 (For this analysis I pasted tissue and condition for the design matrix and use contrasts)
predesignMatrix = factor(paste(dataForAnalysis$Tissue, dataForAnalysis$Condition, sep='.'))
designMatrix = model.matrix(~ 0 + predesignMatrix)
nf = calcNormFactors(counts)
forDgeData.voom <- voom(counts, designMatrix, plot=False, lib.size=colSums(counts)*nf)
forDgeData.dups = duplicateCorrelation(forDgeData.voom, designMatrix, block= dataForAnalysis$Sample)
forDgeData.fitRobust = lmFit(forDgeData.voom, forDgeData.voom$design, correlation= forDgeData.dups$consensus, block= dataForAnalysis$Sample, method='robust')
forDgeData.ebayes = eBayes(forDgeData.fitRobust, robust=True)
designMatrix.forContrast = makeContrasts(diff_Before_vs_After = (A.after-A.before)-(B.after-B.before), levels= designMatrix)
forDgeData.fitRobust.fitRobust2 = contrasts.fit(forDgeData.fitRobust, designMatrix.forContrast)
forDgeData.ebayes = eBayes(forDgeData.fitRobust.fitRobust2)
Using this method, these are my results:
summary(decideTests(forDgeData.ebayes))
diff_Before_vs_After
-1 7
0 15680
1 5
So, comparing results from Analysis No. 1 and No. 2, only 3 genes overlap. My main question here is if there are DE genes between the conditions (before and after), taking into account the tissue factor and also that the samples are duplicated.
Any help or advice... more than welcome!
Thanks