Hello everyone, I am new in microarray data analysis. recently I want to use Limma to find different expressed genes between grade I and grade III, which you could see from the annotation file. but when I use the R code below to do this, I only got two different expressed genes, could anybody tell me why? Is there anything wrong with my code? thank you in advance!
#the annotation file, I can't post the entire file because it is too long.
geo_accn sample_name treatment dataset grade
GSM65321 KIT_104B91 tamoxifen KJX64 3
GSM65330 KIT_114B68 tamoxifen KJX64 1
GSM65335 KIT_131B79 tamoxifen KJX64 3
GSM65336 KIT_135B40 tamoxifen KJX64 1
GSM65331 KIT_138B34 tamoxifen KJX64 1
GSM65338 KIT_162B98 tamoxifen KJX64 2
GSM65332 KIT_173B43 tamoxifen KJX64 2
# the R code
rm(list = ls())
library(affy)
library(affycoretools)
library(simpleaffy)
library(gcrma, warn.conflicts = FALSE)
library(hgu133a.db)
library(a4)
datainfo <- read.table("GSE2990_suppl_new.txt", header = T, stringsAsFactors = FALSE)
data_CELnb <- datainfo[,1]
rawAffyData <- ReadAffy(filenames = paste(data_CELnb, ".CEL", sep = ""))
pdata_lung <- datainfo
rownames(pdata_lung) <- sampleNames(rawAffyData)
metaDatalung <- data.frame(labelDescription = colnames(pdata_lung))
phenoDataLung <- new("AnnotatedDataFrame", data = pdata_lung, varMetadata = metaDatalung)
phenoData(rawAffyData) <- phenoDataLung
eset<-rma(rawAffyData)
design <- model.matrix(~ -1 + factor(pData(eset)$grade))
colnames(design) <- c("Stage1", "Stage2","Stage3")
fit <- lmFit(eset, design)
contrast.matrix <- makeContrasts(Stage1-Stage3, levels = design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
fit2$genes$Symbol=getSYMBOL(fit2$genes$ID, "hgu133a")#insert symbol
fit2$genes$EG <- getEG(fit2$genes$ID, "hgu133a")# insert entrezid
fit2$genes$GeneName <- unlist(mget(fit2$genes$ID, hgu133aGENENAME))# insert the gene name
resultsLimmaCommon <- limma:::topTable(fit2, coef = 1, adjust.method = "fdr", number = nrow(eset),sort.by="P")
resultsLimmaCommon_Sig <- resultsLimmaCommon[abs(resultsLimmaCommon$logFC)> 2 & resultsLimmaCommon$adj.P.Val < 0.05, ]
summary(resultsLimmaCommon_Sig)
I would first check the P value distribution in the topTable... to see how the topTable looks like... to check the distribution