Hi guys,
Sorry for posting a similar question. I have run a microarray differential expression analysis on a different dataset from the one in my previous post.
I ended up with a table of probe names and p-values.
However, I can't tell which genes are differentially expressed because the table has no gene symbols.
Does anyone know how I can map the probe IDs to gene symbols?
I tried the commands below in R to look up gene symbols, but the ENSEMBL and SYMBOL columns came back as NA for every probe:
> require(hgu133a.db)
> probes <- rownames(eset)
> annotLookup <- select(hgu133a.db, keys = probes,
+ columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'))
> head(annotLookup)
   PROBEID ENSEMBL SYMBOL
1 23064070    <NA>   <NA>
2 23064071    <NA>   <NA>
3 23064072    <NA>   <NA>
4 23064073    <NA>   <NA>
5 23064074    <NA>   <NA>
6 23064075    <NA>   <NA>
The rest of my code is below:
> library("arrayQualityMetrics")
> library(GEOquery)
> library(oligo)
> library(Biobase)
> library(affy)
> library("splitstackshape")
> library("tidyr")
> library(dplyr)
> celFiles <- list.celfiles()
> affyRaw <- read.celfiles(celFiles)
Platform design info loaded.
Reading in : GSM766537.CEL
Reading in : GSM766539.CEL
Reading in : GSM766624.CEL
Reading in : GSM766640.CEL
> eset <- oligo::rma(affyRaw)
Background correcting
Normalizing
Calculating Expression
> library(limma)
> pData(eset)
index
GSM766537.CEL 1
GSM766539.CEL 2
GSM766624.CEL 3
GSM766640.CEL 4
> Groups <- c("DDLPS", "DDLPS", "WDLPS", "WDLPS")
> design <- model.matrix(~factor(Groups))
> colnames(design) <- c("DDLPS", "WDLPSvsDDLPS")  # coef 2 is WDLPS minus DDLPS
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> options(digits = 2)
> res <- topTable(fit, number = Inf, adjust.method = "none", coef = 2)
> write.table(res, "diff_exp.txt", sep= "\t")
> require(hgu133a.db)
> probes <- rownames(eset)
> annotLookup <- select(hgu133a.db, keys = probes,
+ columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'))
Thanks again!
Thanks for your advice. I still get the same table, with every entry NA, when I use hgu133a.db.
I'm not sure whether these are actually probe IDs.
Do you know another way I can get gene symbols from these IDs?
I think it's because you're using the oligo package. Why are you using this package instead of just affy and limma? Your IDs appear to be coming from the oligo package.
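One way to check whether the IDs match the annotation package at all is to measure how many of them are valid keys for it. Here is a minimal sketch in base R, using stand-in data: `ids` copies a few of the IDs from the output above, `valid_keys` holds three hgu133a-style probe IDs for illustration (in a real session it would come from `keys(hgu133a.db, keytype = "PROBEID")`), and `key_overlap` is a hypothetical helper, not a function from any package.

```r
# Hypothetical helper: fraction of IDs that are valid keys
# for a given annotation package.
key_overlap <- function(ids, valid_keys) {
  mean(ids %in% valid_keys)
}

# IDs as they appear in the question's output
ids <- c("23064070", "23064071", "23064072")

# Stand-in for keys(hgu133a.db, keytype = "PROBEID")
valid_keys <- c("1007_s_at", "1053_at", "117_at")

overlap <- key_overlap(ids, valid_keys)
print(overlap)
```

If the overlap is 0, none of the IDs belong to that platform, so `select()` can only return NA. In that case `annotation(eset)` should report which platform package was loaded for the arrays, and the annotation package has to match that platform.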
For the command below I get the following error message. Do you know how I might correct this?
Error in getCdfInfo(object) :
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain Clariom_S_Human
Library - package clariomshumancdf not installed
Bioconductor - clariomshumancdf not available
Also, if I wanted to look up all the probes rather than only the first ten, how would I do that?
I'm not sure about that error; I can't reproduce it. If you search for it, you'll find some answers on the Bioconductor support site, along with Clariom S Human packages that you may need to install. The affy package documentation may also explain what's required.
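Before installing anything, it may help to see which candidate packages are already available. This is a rough sketch: `is_installed` is a hypothetical helper, and the two package names are my assumption about what a Clariom S Human array would need, so please confirm on Bioconductor that they exist and fit your arrays.

```r
# Hypothetical helper: report which packages can be loaded,
# without attaching them to the search path.
is_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# Assumed candidates for Clariom S Human; verify the names first.
candidates <- c("pd.clariom.s.human",
                "clariomshumantranscriptcluster.db")

status <- is_installed(candidates)
print(status)
```

Anything reported as FALSE could then be installed with `BiocManager::install()`, provided the name really is available on Bioconductor.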
As for looking at more than 10 probes: in the code above I looked up only the first 10 with
probes[1:10]
so if you leave off the indexing (the square brackets), you'll get them all.
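To make the difference concrete, here is a small base R sketch. The `probes` vector is made up for illustration; in your session it would be `rownames(eset)`.

```r
# Made-up probe IDs standing in for rownames(eset)
probes <- paste0("probe_", 1:25)

# With indexing: only the first ten elements
first_ten <- probes[1:10]

# Without indexing: the whole vector
all_probes <- probes

print(length(first_ten))
print(length(all_probes))
```

In the lookup call, that means passing `keys = probes` rather than `keys = probes[1:10]`.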