Unreliable results of R while retrieving DEG
0
0
Entering edit mode
8.4 years ago
786 ▴ 50

Hi I was working on GSE55650 and received different number of differentially expressed genes just by changing the group format. Although the control and diseased were same in both types of groups but still get different number of genes. I just don't understand why is it so?Can anyone help me in understanding this?I'll be very thankful.

First situation:
groups = pData(phenoData(gse55650dat[[1]]))$Disease_states
groups=as.character(groups)
groups=sort(groups)
groups[c(1:6)]="control"
groups[c(7:12)]="diabetic"
groups[c(13:17)]="control"
groups[c(18:23)]="diabetic"

It gives 21 number of differentially expressed genes BH-adjusted p < 0.05 as the cut-off.

Second situation:
groups = pData(phenoData(gse55650dat[[1]]))$source_name_ch1 
groups=as.character(groups) 
groups[groups=="Control myoblasts"]="non.diabetic" 
groups[groups=="Diabetes myoblasts"]="diabetic"
groups[groups=="Control myotubes"]="non.diabetic" 
groups[groups=="Diabetes myotubes"]="diabetic"

It gives 14 number of differentially expressed genes at same p value criteria. Why this difference appears?Please help me understanding this it'll be highly appreciated.

R affy limma GEOquery • 1.7k views
ADD COMMENT
0
Entering edit mode

can you please post the groups vector before and after the renaming? i.e. for both cases you tried, I would like to see the groups vector after the as.character(groups) thing and then after the whole renaming step :)

ADD REPLY
0
Entering edit mode
groups = pData(phenoData(gse55650dat[[1]]))$source_name_ch1 
groups=as.character(groups) 
groups[groups=="Control myoblasts"]="non.diabetic" 
groups[groups=="Diabetes myoblasts"]="diabetic"
groups[groups=="Control myotubes"]="non.diabetic" 
groups[groups=="Diabetes myotubes"]="diabetic"
f = factor(groups, levels=c("non.diabetic","diabetic"))
design_gse55650 = model.matrix(~0+f)
colnames(design_gse55650) = levels(f)
cont.matrix = makeContrasts(diabetic-non.diabetic, levels=design_gse55650)

fit = lmFit(gse55650eset, design_gse55650)
fit2= contrasts.fit(fit, cont.matrix)
fit2 = eBayes(fit2)

fit2$genes$Symbol = getSYMBOL(fit2$genes$ID, "hgu133plus2") 
fit2$genes$GeneName = unlist(mget(fit$genes$ID, hgu133plus2GENENAME))  

results = topTable(fit2, adjust ="BH", number = nrow(gse55650eset))
ADD REPLY
0
Entering edit mode

Thanks but I mean can you actually print the groups vector? :)

ADD REPLY

Login before adding your answer.

Traffic: 2032 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6