Microarray differential expression analysis: mapping probe names to gene symbols
2.1 years ago

Hi guys,

Sorry for posting a similar question. I have run a microarray differential expression analysis on a different dataset from the one in my previous post.

I ended up with a table of probe names and p-values.

However, I can't tell which genes are differentially expressed because the table has no gene symbols.

Does anyone know how I can map probe IDs to gene symbols?

I tried the commands below in R to look up gene symbols, but every entry in the SYMBOL column came back as <NA>.

> require(hgu133a.db)
> probes <- rownames(eset)
> annotLookup <- select(hgu133a.db, keys = probes,
+   columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'))
> head(annotLookup)
   PROBEID ENSEMBL SYMBOL
1 23064070    <NA>   <NA>
2 23064071    <NA>   <NA>
3 23064072    <NA>   <NA>
4 23064073    <NA>   <NA>
5 23064074    <NA>   <NA>
6 23064075    <NA>   <NA>

The rest of my code is below:

> library("arrayQualityMetrics")
> library(GEOquery)
> library(oligo)
> library(Biobase)
> library(affy)
> library("splitstackshape")
> library("tidyr")
> library(dplyr)
> celFiles <- list.celfiles()
> affyRaw <- read.celfiles(celFiles)
Platform design info loaded.
Reading in : GSM766537.CEL
Reading in : GSM766539.CEL
Reading in : GSM766624.CEL
Reading in : GSM766640.CEL
> eset <-oligo::rma(affyRaw)
Background correcting
Normalizing
Calculating Expression
> library(limma)
> pData(eset)
              index
GSM766537.CEL     1
GSM766539.CEL     2
GSM766624.CEL     3
GSM766640.CEL     4
> Groups <- c("DDLPS", "DDLPS", "WDLPS", "WDLPS")
> design <- model.matrix(~factor(Groups))
> colnames(design) <- c("DDLPS", "DDLPSvsWDLPS")
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> options (digits =2)
> res <- topTable (fit, number = Inf, adjust.method = "none", coef = 2)
> write.table(res, "diff_exp.txt", sep= "\t")
> require(hgu133a.db)
> probes <- rownames(eset)
> annotLookup <- select(hgu133a.db, keys = probes,
+   columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'))

Thanks again!
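A side note on the design matrix in the code above: with ~factor(Groups), R orders the factor levels alphabetically, so DDLPS becomes the reference level and the second coefficient is WDLPS minus DDLPS, whatever the column is renamed to afterwards. A small base-R check, using only the group labels from the post (no expression data needed):

```r
# Group labels copied from the post
Groups <- c("DDLPS", "DDLPS", "WDLPS", "WDLPS")
design <- model.matrix(~ factor(Groups))

# Default column names show which level each coefficient refers to
print(colnames(design))

# The second column is 1 for the WDLPS samples and 0 for DDLPS,
# so coef = 2 in topTable() estimates WDLPS minus DDLPS
print(design[, 2])
```

So a positive logFC for coef = 2 would mean higher expression in WDLPS than in DDLPS.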

2.1 years ago
seidel 11k

You should confirm that you are getting proper probe set IDs. The IDs you show are not the kind of Affymetrix probe set IDs that hgu133a.db uses (those look like 221288_at). Perhaps you can try:

probes <- featureNames(eset)
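One quick way to check what kind of IDs you have is to look at their textual pattern. The sketch below is only a heuristic and uses base R; classify_ids is a hypothetical helper name, and the patterns are assumptions based on the IDs that appear in this thread (probe sets ending in _at, transcript clusters such as TC0200008291.hg.1, and purely numeric IDs):

```r
# Hypothetical helper: label feature IDs by pattern (heuristic, not exhaustive)
classify_ids <- function(ids) {
  ifelse(grepl("_at$", ids), "at_style",
    ifelse(grepl("^TC[0-9A-Za-z]+\\.hg\\.[0-9]+$", ids), "transcript_cluster",
      ifelse(grepl("^[0-9]+$", ids), "numeric", "other")))
}

# Example IDs taken from this thread
ids <- c("221288_at", "1007_s_at", "TC0200008291.hg.1", "23064070")
print(table(classify_ids(ids)))
```

With a real expression set you would pass featureNames(eset) instead of the example vector. If nothing comes back as at_style, an annotation package built for _at probe sets such as hgu133a.db is unlikely to match.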

The hgu133a db does have Gene Symbol mappings for some proportion of the Ensembl IDs.

probes <- c("244493_at", "221288_at", "211600_at")
select(hgu133a.db, columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'), keys=probes)

'select()' returned 1:many mapping between keys and columns
    PROBEID         ENSEMBL SYMBOL
1 244493_at            <NA>   <NA>
2 221288_at ENSG00000172209  GPR22
3 221288_at ENSG00000283812  GPR22
4 211600_at ENSG00000151490  PTPRO

You can also retrieve gene/Affymetrix probe ID mappings from Ensembl BioMart.

Alternatively, maybe none of the probe sets coming up as differentially expressed have gene mappings (a significant fraction of human Ensembl gene IDs have no hgu133a mapping).


Thanks for your advice, I still get the same table with all N/A when I use hgu133a.db.

8086                    TC0200008291.hg.1    <NA>   <NA>
8087                    TC0200008300.hg.1    <NA>   <NA>
8088                    TC0200008323.hg.1    <NA>   <NA>
8089                    TC0200008334.hg.1    <NA>   <NA>
8090                    TC0200008335.hg.1    <NA>   <NA>

I'm not sure if they are probe ids.

Do you know another way I can get gene symbols from these ids?


I think it's because you're using the oligo package. Why are you using this package instead of just affy and limma? Your IDs appear to be coming from the oligo package.

library(affy)
library(limma)
library(hgu133a.db)

celFiles <- list.celfiles()
affyRaw <- ReadAffy(filenames = celFiles)

# use rma() from the affy package on the object
eset <- rma(affyRaw)
Groups <- c("DDLPS", "DDLPS", "WDLPS", "WDLPS")
design <- model.matrix(~factor(Groups))
colnames(design) <- c("DDLPS", "DDLPSvsWDLPS")
fit <- lmFit(eset, design)
fit <- eBayes(fit)
options(digits = 2)
res <- topTable(fit, number = Inf, adjust.method = "none", coef = 2)

# probe set IDs of the normalised data
probes <- featureNames(eset)

# examine the first 10 probe sets
select(hgu133a.db, keys = probes[1:10], columns = c('PROBEID', 'ENSEMBL', 'SYMBOL'))
'select()' returned 1:many mapping between keys and columns
     PROBEID         ENSEMBL SYMBOL
1  1007_s_at ENSG00000204580   DDR1
2  1007_s_at ENSG00000223680   DDR1
3  1007_s_at ENSG00000229767   DDR1
4  1007_s_at ENSG00000234078   DDR1
5  1007_s_at ENSG00000215522   DDR1
6  1007_s_at ENSG00000230456   DDR1
7  1007_s_at ENSG00000137332   DDR1
8    1053_at ENSG00000049541   RFC2
9     117_at ENSG00000173110  HSPA6
10    121_at ENSG00000125618   PAX8
11 1255_g_at ENSG00000048545 GUCA1A
12   1294_at ENSG00000182179   UBA7
13   1316_at ENSG00000126351   THRA
14   1320_at ENSG00000070778 PTPN21
15 1405_i_at ENSG00000271503   CCL5
16 1405_i_at ENSG00000274233   CCL5
17   1431_at ENSG00000130649 CYP2E1

For the command below I get the following error message. Do you know how I might correct this?

eset <-rma(affyRaw)

Error in getCdfInfo(object) :
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain Clariom_S_Human
Library - package clariomshumancdf not installed
Bioconductor - clariomshumancdf not available

Also if I wanted to examine all the probes and not only the first ten, how might I be able to do this?
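The error message names Clariom_S_Human, which suggests the CEL files come from a Clariom S Human array rather than HG-U133A. That would also explain why hgu133a.db returns NA for IDs such as TC0200008291.hg.1. Below is a minimal sketch of annotating such IDs. It assumes the Bioconductor package clariomshumantranscriptcluster.db is the matching annotation package for these arrays and that it is installed; both points should be verified for your data. annotate_ids is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: look up gene symbols for transcript cluster IDs.
# Assumes the annotation package named in `pkg` matches the array type.
annotate_ids <- function(ids, pkg = "clariomshumantranscriptcluster.db") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Annotation package not installed: ", pkg)
    return(NULL)
  }
  # Annotation packages export a database object named after the package
  db <- getExportedValue(pkg, pkg)
  # Keep only IDs the package knows, since select() stops if no key is valid
  ids <- intersect(ids, AnnotationDbi::keys(db, keytype = "PROBEID"))
  if (length(ids) == 0) return(NULL)
  AnnotationDbi::select(db, keys = ids, keytype = "PROBEID",
                        columns = c("SYMBOL", "ENSEMBL"))
}

# Usage with the expression set produced by oligo::rma():
# annotLookup <- annotate_ids(featureNames(eset))
```

If this is right, the original oligo workflow can stay as it is and only the annotation package needs to change.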


I'm not quite sure about the error. I don't get that error. If you google it, you can find some answers on bioconductor about it, and some libraries with clariomshuman stuff that you might have to figure out or install. Reading the affy package docs might explain what's required.

As for looking at more than 10 probes, you'll see in the code above, I specified looking up only the first 10 with: probes[1:10], so if you leave off the indexing (square brackets), you'll get them all.
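Once all probes are looked up, the 1:many mappings mean the annotation table has more rows than the results table. One possible way to handle that is to collapse the annotation to one row per probe before merging it with the topTable() output. The sketch below uses toy data frames in base R: the annotation rows are copied from this thread, the P.Value numbers are made up for illustration, and collapse_unique is a hypothetical helper name.

```r
# Toy stand-in for select() output (rows copied from this thread)
annot <- data.frame(
  PROBEID = c("1007_s_at", "1007_s_at", "1053_at", "117_at", "244493_at"),
  ENSEMBL = c("ENSG00000204580", "ENSG00000223680", "ENSG00000049541",
              "ENSG00000173110", NA),
  SYMBOL  = c("DDR1", "DDR1", "RFC2", "HSPA6", NA),
  stringsAsFactors = FALSE
)

# Toy stand-in for topTable() output; these p-values are invented
res <- data.frame(P.Value = c(0.01, 0.20, 0.03, 0.50),
                  row.names = c("1053_at", "1007_s_at", "117_at", "244493_at"))

# Join the distinct non-missing values of one probe with ";"
collapse_unique <- function(x) {
  x <- unique(x[!is.na(x)])
  if (length(x) == 0) NA_character_ else paste(x, collapse = ";")
}

# One row per probe
annot1 <- aggregate(annot[c("ENSEMBL", "SYMBOL")],
                    by = list(PROBEID = annot$PROBEID),
                    FUN = collapse_unique)

# Attach the annotation to the results, keeping unannotated probes
res$PROBEID <- rownames(res)
annotated <- merge(res, annot1, by = "PROBEID", all.x = TRUE)
annotated <- annotated[order(annotated$P.Value), ]
print(annotated)
```

With real data, annot would be the full select() result and res the topTable() result; the collapsing and merging steps stay the same.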
