I am trying to do background correction using RMA on my GEO dataset, which was produced on an Affymetrix array (GPL570), and then analyse it with the limma package in R.
I found this tutorial on normalization: From Data import to Normalization in Microarray Analysis using in R (Part I)
However, I don't know how to take the normalized output and use it as input to limma to identify differentially expressed genes. I have written the following code, but topTable returns 0 DEGs:
library("limma")
library("affy")
library("gcrma")
setwd('E://GSE...7/')
d1<-ReadAffy()
data.rma <- expresso(d1,bgcorrect.method="rma",normalize.method="quantiles",pmcorrect.method="pmonly",summary.method="medianpolish")
eset <- exprs(data.rma)
sample <- factor(rep(c("Case","Cont"), each = 10))
design.mat <- model.matrix(~0+sample)
colnames(design.mat) <- levels(sample)
design.mat
fit <- lmFit(eset,design.mat)
fit3 <- eBayes(fit)
deg <- topTable(fit3, coef = 2, p.value = 0.05, adjust.method = 'BH', number = nrow(eset), lfc = 2)
When I remove the lfc criterion, I get genes with very small fold-change values; none are above 2. In contrast, when I perform MAS5 background correction I get high fold-change values. How do I get the actual fold-change values?
You are not reading in any data (?) Which GEO dataset are you using?
I set the path before calling d1<-ReadAffy(), and d1 is not empty. I am using the GSE4757 dataset.
Okay. What is the lowest P value obtained when you just run:
Also, the argument that you pass to lfc is expected to already be on the log (base 2) scale, i.e., you don't have to log the value yourself with lfc = log2(1.5) (it would just be lfc = 1.5).
I have modified the code a bit, but I am still not getting proper fold-change values. With RMA my fold changes go down, and with MAS5 they go up. How do I correct the fold-change values?
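Since both the lfc argument and the logFC column of topTable are on the log2 scale, it can help to convert between fold change and log2 fold change explicitly. The following is only a base-R arithmetic sketch with made-up numbers, not output from GSE4757. It is also worth checking whether the MAS5 values were log2-transformed before lmFit: as far as I know, affy's mas5() returns unlogged intensities, whereas RMA values are already log2, and that alone could make the two sets of "fold changes" look very different.

```r
# Fold change <-> log2 fold change (illustrative values only)
fc  <- c(1.5, 2, 4)
lfc <- log2(fc)            # the scale on which topTable's lfc threshold is interpreted
round(lfc, 3)

# A threshold of lfc = 1.5 corresponds to this fold change on the original scale
2^1.5

# The same pair of intensities compared on two different scales
case_raw <- 800            # hypothetical unlogged intensity in a Case sample
cont_raw <- 400            # hypothetical unlogged intensity in a Cont sample
diff_raw  <- case_raw - cont_raw               # difference of unlogged values
diff_log2 <- log2(case_raw) - log2(cont_raw)   # log2 fold change
c(raw = diff_raw, log2 = diff_log2)
```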
Can you actually show the fold changes obtained by the different normalisation strategies? For example, plot the sorted / ordered fold changes with plot(). Can you also show the boxplot of your normalised data?
Go here to learn how to share images: How to add images to a Biostars post
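A minimal sketch of the two plots requested above, using base R only. The matrices eset_rma and eset_mas5 are simulated stand-ins (hypothetical names and values), and the grouping assumes the first 10 columns are Case and the last 10 are Cont; replace them with the real log2 expression matrices and sample order.

```r
set.seed(1)

# Simulated stand-ins for two normalised log2 expression matrices
# (500 probes x 20 samples); replace with the real data
eset_rma  <- matrix(rnorm(500 * 20, mean = 8, sd = 1), nrow = 500)
eset_mas5 <- matrix(rnorm(500 * 20, mean = 8, sd = 2), nrow = 500)
group <- factor(rep(c("Case", "Cont"), each = 10))

# Per-probe log2 fold change: mean(Case) - mean(Cont)
log_fc <- function(x, g) {
  rowMeans(x[, g == "Case", drop = FALSE]) - rowMeans(x[, g == "Cont", drop = FALSE])
}
fc_rma  <- sort(log_fc(eset_rma, group))
fc_mas5 <- sort(log_fc(eset_mas5, group))

# Sorted fold changes for both strategies on one plot
plot(fc_mas5, type = "l", col = "red",
     xlab = "Probe rank", ylab = "log2 fold change")
lines(fc_rma, col = "blue")
legend("topleft", legend = c("MAS5", "RMA"), col = c("red", "blue"), lty = 1)

# Per-sample distributions after normalisation
boxplot(eset_rma, main = "RMA (simulated)", ylab = "log2 intensity")
```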
Also, you should consider using oligo, and not affy. For some Affymetrix arrays, oligo is actually the better option.
I couldn't plot the whole logFC, but the following is head(genes) in descending order of logFC:
I think my design matrix is not correct; it should be like:
Please correct me if I am wrong.
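For reference, one common way to set up a two-group comparison in limma is sketched below; it is not necessarily what the poster had in mind. With a ~0 + sample design each coefficient is a group mean, so coef = 2 on its own would test the Cont mean against zero rather than Case against Cont, and the comparison has to be requested as an explicit contrast. The expression matrix here is simulated, the contrast name CaseVsCont is made up for illustration, and the sample order (10 Case followed by 10 Cont) is an assumption that must be checked against the actual CEL file order.

```r
library(limma)

set.seed(42)
# Simulated log2 expression matrix: 200 probes x 20 samples (10 Case, then 10 Cont)
eset <- matrix(rnorm(200 * 20, mean = 8), nrow = 200,
               dimnames = list(paste0("probe", 1:200), paste0("s", 1:20)))
eset[1:10, 1:10] <- eset[1:10, 1:10] + 3   # spike in 10 probes that are up in Case

sample <- factor(rep(c("Case", "Cont"), each = 10))
design.mat <- model.matrix(~0 + sample)
colnames(design.mat) <- levels(sample)

fit <- lmFit(eset, design.mat)

# Request the Case - Cont difference explicitly, then moderate the statistics
contrast.mat <- makeContrasts(CaseVsCont = Case - Cont, levels = design.mat)
fit2 <- eBayes(contrasts.fit(fit, contrast.mat))

# lfc = 1 keeps probes with at least a 2-fold change (log2 scale)
deg <- topTable(fit2, coef = "CaseVsCont", adjust.method = "BH",
                p.value = 0.05, lfc = 1, number = Inf)
head(deg)
```

An equivalent alternative is a design with an intercept (model.matrix(~sample)), in which case the second coefficient is itself the difference between the two groups.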