Dear all,
After some attempts with DESEQ2, I have now moved to Limma to solve my issue with 3 samples per individual in my Differential Expression Analysis. I have to adjust my model for age, sample site and diagnosis. I am interested in differentially expressed genes in sex (male versus female).
I followed this tutorial under "Limma Analysis": https://bioc.ism.ac.jp/packages/3.14/bioc/vignettes/variancePartition/inst/doc/dream.html
I wonder if this code makes sense now, I dont quite understand why sometimes people put "~0 + AGE.GROUP + ..." What is the +0 doing, is it fine without?
# apply duplicateCorrelation is two rounds
design = model.matrix( ~ AGE.GROUP + SAMPLE.SITE + DIAGNOSIS + SEX, metadata)
vobj_tmp = voom( geneExpr, design, plot=FALSE)
dupcor <- duplicateCorrelation(vobj_tmp,design,block=metadata$INDIVIDUAL)
# run voom considering the duplicateCorrelation results
# in order to compute more accurate precision weights
# Otherwise, use the results from the first voom run
vobj = voom( geneExpr, design, plot=FALSE, block=metadata$INDIVIDUAL, correlation =dupcor$consensus)
# Estimate linear mixed model with a single variance component
# Fit the model for each gene,
#dupcor <- duplicateCorrelation(vobj, design, block=metadata$INDIVIDUAL)
# But this step uses only the genome-wide average for the random effect
fitDupCor <- lmFit(vobj, design, block=metadata$INDIVIDUAL, correlation=dupcor$consensus)
# Fit Empirical Bayes for moderated t-statistics
fitDupCor <- eBayes( fitDupCor )
Thank you so much, Bine
Thank you very much for this explanation.