microarray bioinformatic analyses
2.0 years ago

Hi,

I'm trying to perform microarray differential expression analyses in R using this NCBI dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE159659.

These are the commands I have used so far:

library(GEOquery)
library(limma)
library(preprocessCore)


options("download.file.method.GEOquery" = "wget")

# obtain GE data ----------------------------------------------------------

# load series (expression matrix) and platform data from GEO

gset <- getGEO("GSE159659", GSEMatrix = TRUE, getGPL= TRUE)
length(gset)
gset <- gset[[1]]

# subset data for conditions of interest ----------------------------------

gs0 <- pData(gset)$`subtype:ch1`   
table(gs0)

However, table(gs0) returns the following result:

< table of extent 0 >

It should list the different tissue types (adipose, well differentiated liposarcoma, dedifferentiated liposarcoma) and the number of samples belonging to each.

Does anyone know where I have gone wrong?

Thanks!

2.0 years ago
Basti ★ 2.0k

There is no column subtype:ch1 in your dataset.

You need tissue:ch1 instead:

gs0 <- pData(gset)$`tissue:ch1`   
table(gs0)
gs0
                 adipose tissue    dedifferentiated liposarcoma well differentiated liposarcoma 
                             15                              15                              15 
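
As a rough illustration of why the original call printed "table of extent 0": asking a data frame for a column that does not exist returns NULL, and table(NULL) is empty. The sketch below uses a small made-up data frame (`pheno`) as a stand-in for pData(gset), so it runs without downloading anything; only the column name tissue:ch1 and the tissue labels come from the thread, the rest is assumed for the example.

```r
# Toy stand-in for pData(gset); the sample titles are invented for illustration
pheno <- data.frame(
  title = paste0("sample", 1:6),
  `tissue:ch1` = rep(c("adipose tissue",
                       "dedifferentiated liposarcoma",
                       "well differentiated liposarcoma"), each = 2),
  check.names = FALSE  # keep the ":" in the column name
)

# A column that does not exist comes back as NULL, and table(NULL) is empty
missing_col <- pheno$`subtype:ch1`
print(is.null(missing_col))
print(table(missing_col))

# List the characteristic columns that are actually present
ch1_cols <- grep(":ch1$", colnames(pheno), value = TRUE)
print(ch1_cols)

# Tabulate using a column that exists
counts <- table(pheno[["tissue:ch1"]])
print(counts)
```

On the real object, the same grep() call on colnames(pData(gset)) should show which ":ch1" columns are available before you pick one.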

Thanks, it worked. I just had another question.

keep <- c("dedifferentiated liposarcoma", "well differentiated liposarcoma")
sample_idx <- which(gs0 %in% keep)

gset <- gset[, sample_idx]

gs <- pData(gset)$`tissue:ch1`
table(gs)

# differential expression analysis ----------------------------------------

# assign samples to groups and set up design matrix

gs <- factor(ifelse(gs == "well differentiated liposarcoma", "WDLPS", "DDLPS"))
gset$group <- gs

design <- model.matrix(~group, gset)

For the command design <- model.matrix(~group, gset) I get the following error message: Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Do you know how I might correct this?


I do not get this error with your code.


Apologies, I have now retried the code and it worked.

Finally, I tried to run the command below to write out a table of p-values for differential expression. However, I cannot locate the CSV file in the working directory.

Do you know where it might be?

write.csv(tT, "output/DEA_resul.csv", row.names = FALSE)

The code before this command:

# fit linear model
fit <- lmFit(gset, design)  

# compute statistics
fit2 <- eBayes(fit, 0.01)

# table of DEA results
tT <- topTable(fit2, adjust = "fdr", sort.by = "P", number= Inf)

Check the current working directory with getwd().


I have checked the Desktop and the file is not there. write.csv(tT, "output/DEA_resul.csv", row.names = FALSE) also gives the following error message:

> getwd()
[1] "/Users/mesalie/Desktop"
> write.csv(tT, "output/DEA_resul.csv", row.names = FALSE)

Error in file(file, ifelse(append, "a", "w")) :
  cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file 'output/DEA_resul.csv': No such file or directory


Did you create the folder "output" inside your working directory? write.csv() will not create it for you.
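
A minimal sketch of creating the folder from R before writing, assuming a toy data frame in place of the real tT and a temporary directory in place of the Desktop (both are stand-ins so the example can run anywhere):

```r
# Toy stand-in for the topTable() result `tT`
tT <- data.frame(ID = c("probe1", "probe2"), P.Value = c(0.001, 0.2))

# Temporary base directory for this demo; in the thread this would be getwd()
base_dir <- tempfile("dea_demo_")
dir.create(base_dir)

# Create the output folder only if it is not already there
out_dir <- file.path(base_dir, "output")
if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)

# Build the full path explicitly so it is clear where the file lands
out_file <- file.path(out_dir, "DEA_resul.csv")
write.csv(tT, out_file, row.names = FALSE)

print(out_file)
print(file.exists(out_file))
```

In your own session, replacing base_dir with getwd() would put the file in an "output" folder next to wherever R is currently working.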


No, I didn't. Thank you.

Also, after running quality control, I received an error message for design <- model.matrix(~group, gset):

pdf("output/1.QC_boxplot.pdf", width = 12, height = 5)
par(mar=c(7,4,2,1))
boxplot(exprs(gset), boxwex = 0.7, notch = TRUE, outline = FALSE, las = 2)
dev.off()

# expression value distribution
pdf("output/2.QC_density_plot.pdf", width = 6, height = 6)
par(mar=c(4,4,2,1))
title <- paste ("GSE159659", "/", annotation(gset), " value distribution", sep ="")
plotDensities(exprs(gset), group = gs, main = title, legend ="topright")
dev.off()

gs <- factor(ifelse(gs == "well differentiated liposarcoma", "WDLPS", "DDLPS"))
gset$group <- gs
design <- model.matrix(~group, gset)

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Do you know how I might correct this?


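You changed "well differentiated" to "well-differentiated", which is not present in your dataset. With no matching label, every sample is assigned to "DDLPS" and the factor ends up with a single level. Please check your code carefully before asking such basic questions.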
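
One way to catch this earlier is to check the number of factor levels before calling model.matrix(). The sketch below is only an illustration: `make_group` is a hypothetical helper, and `tissue` is a toy vector standing in for pData(gset)$`tissue:ch1` after subsetting.

```r
# Toy stand-in for the tissue labels of the subsetted samples
tissue <- c("dedifferentiated liposarcoma", "well differentiated liposarcoma",
            "dedifferentiated liposarcoma", "well differentiated liposarcoma")

# Hypothetical helper: recode labels and fail early if a group is missing
make_group <- function(labels, wd_label) {
  group <- factor(ifelse(labels == wd_label, "WDLPS", "DDLPS"))
  if (nlevels(group) < 2) {
    stop("only one group found; check that '", wd_label,
         "' matches the labels exactly: ",
         paste(unique(labels), collapse = ", "))
  }
  group
}

# Exact label: two levels, so the design matrix can be built
good <- make_group(tissue, "well differentiated liposarcoma")
design <- model.matrix(~group, data.frame(group = good))
print(colnames(design))

# Hyphenated label matches nothing, so every sample would become "DDLPS"
bad <- tryCatch(make_group(tissue, "well-differentiated liposarcoma"),
                error = function(e) conditionMessage(e))
print(bad)
```

The same collapse to one level happens if the recoding line is run a second time on a vector that already contains "WDLPS" and "DDLPS", because neither value equals the original tissue label. Re-reading gs from pData(gset)$`tissue:ch1` before recoding avoids that.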
