Error in converting the probe sets
3.8 years ago
zizigolu ★ 4.3k

Hello

I have an expression set from rat, GSE43700.

I am trying to annotate the probes, but I get this error:

> columns(hgu133plus2.db) # Features retrievable by AnnotationDbi::select
 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROBEID"      "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"      
[26] "UNIGENE"      "UNIPROT"     
> anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
>

Any help please?

affy array • 2.0k views
ADD COMMENT

See the answer of James on Bioconductor:

ADD REPLY
3.8 years ago

Hello again. Can you let me know what the output of the following is:

featureNames(filt_eset2)

?

ADD COMMENT

Thank you

> featureNames(filt_eset2)
   [1] "1399067_at"   "1388081_at"   "1375569_at"   "1389201_at"   "1371559_at"
ADD REPLY

Thank you. You said that it is rat; however, the GEO record indicates that it is human. Can you show all of your initial data processing steps?

ADD REPLY

Thank you

Yes, this is rat. As this is public data, here is my R object:

https://www.dropbox.com/s/k338ooac2dpifg6/GSEA_Broad.R?dl=0

And the full R script:

https://www.dropbox.com/s/qxd8pan0jpuoq1m/Preranked_fGSEA%20%281%29.R?dl=0

And these are the first lines of what I ran before the error:

gse <- getGEO("GSE43700", GSEMatrix = TRUE, AnnotGPL = TRUE)

show(gse)

head(exprs(gse[[1]]))[,1:5]

pdata <- as.data.frame(pData(gse[[1]]), stringsAsFactors = FALSE)

all(colnames(exprs(gse[[1]]))==pdata$geo_accession)

plotMDS(exprs(gse[[1]]), labels = pdata$title)
plotMDS(exprs(gse[[1]]), labels = pdata$`donor_id:ch1`)



## Accessing raw data from GEO using getGEOSuppFiles()

getGEOSuppFiles("GSE43700", makeDirectory = FALSE)

list.files()

untar("GSE43700_RAW.tar", exdir = "rawdata")


list.files()

rawdata <- ReadAffy() # Import files

show(rawdata)

rma <- rma(rawdata, normalize = FALSE, background = FALSE) # Skip normalization and background correction to illustrate why they are required

boxplot(exprs(rma), las=2)

dim(exprs(rawdata))
dim(exprs(rma)) # rma() combines the individual probe intensities into a probe set intensity

rma <- rma(rawdata, normalize = TRUE, background = TRUE) # Normalization and background correction

boxplot(exprs(rma), las=2)

filt_eset2 <- featureFilter(rma, require.entrez = TRUE, remove.dupEntrez = TRUE)

dim(rma)
dim(filt_eset2)

## Add annotation information to the eSet featureData slot

columns(hgu133plus2.db) # Features retrievable by AnnotationDbi::select

anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
ADD REPLY

No, GSE43700 is definitely human data, not rat. Please check the GEO record: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43700

The following lines are a problem:

untar("GSE43700_RAW.tar", exdir = "rawdata")

Here, you decompress to a directory called 'rawdata'.

rawdata <- ReadAffy() # Import files

Here, you are reading in files from the current working directory, which, on your computer, seems to contain rat data (probably from some other study that was indeed Rattus norvegicus).

What you need is this:

rawdata <- ReadAffy(filenames = list.files('rawdata/', full.names = TRUE))
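
A hedged sketch of how one might make this step harder to get wrong: list the CEL files explicitly and check which chip they come from before going further. The helper name find_cel_files is hypothetical (not from affy), and the sketch assumes the files end in .CEL or .CEL.gz; the affy part only runs if the package is installed and files are found.

```r
# Hypothetical helper: return the CEL files inside one directory, so that
# files lying around in the working directory cannot be picked up by accident.
find_cel_files <- function(dir) {
  # Match .CEL and .CEL.gz, ignoring case
  list.files(dir, pattern = "\\.cel(\\.gz)?$", ignore.case = TRUE,
             full.names = TRUE)
}

# 'rawdata' is the directory name used in this thread
cel_files <- find_cel_files("rawdata")
print(cel_files)

# Only attempt the import when there are files and affy is available
if (length(cel_files) > 0 && requireNamespace("affy", quietly = TRUE)) {
  rawdata <- affy::ReadAffy(filenames = cel_files)
  # Chip type recorded for the AffyBatch; a human chip name here would
  # mean the wrong files were read for a rat analysis (and vice versa)
  print(BiocGenerics::annotation(rawdata))
}
```

Printing the file list first makes it obvious when the directory is empty or contains files from a different study.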
ADD REPLY

Sorry, you are right.

The right dataset is GSE2457.

I put the CEL files from GSE2457 in the rawdata directory and ran the code again, but I got the same error:

> anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
>
ADD REPLY

Excellent. So, you need to use rae230a.db in place of hgu133plus2.db:

  • hgu133plus2.db is for the Affymetrix Human Genome U133 Plus 2.0 Array (GSE43700)
  • rae230a.db is for the Affymetrix Rat Expression 230A Array (GSE2457)
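
As a rough sketch of how to catch this kind of mismatch before calling select(), one could compare the probe IDs against the keys of the annotation package. The helpers key_overlap and annotation_pkg_name are hypothetical names introduced here, the toy key vector is made up for illustration, and the last part assumes filt_eset2 exists and rae230a.db is installed. The ".db" suffix rule reflects the usual naming of Bioconductor chip annotation packages and should be treated as an assumption for any given chip.

```r
# Hypothetical helper: fraction of probe IDs that are valid keys.
# A value of 0 corresponds to the "None of the keys entered are valid" error.
key_overlap <- function(probe_ids, valid_keys) {
  if (length(probe_ids) == 0) return(NA_real_)
  mean(probe_ids %in% valid_keys)
}

# Hypothetical helper: usual package name for a chip, e.g. "rae230a" -> "rae230a.db"
annotation_pkg_name <- function(chip) paste0(chip, ".db")

# Toy illustration: probe IDs from this thread against a made-up key list
probe_ids <- c("1399067_at", "1388081_at", "1375569_at")
toy_keys  <- c("1007_s_at", "1053_at")  # illustrative stand-in for another chip's keys
print(key_overlap(probe_ids, toy_keys))
print(annotation_pkg_name("rae230a"))

# With the real objects (only runs if they are available)
if (exists("filt_eset2") && requireNamespace("rae230a.db", quietly = TRUE)) {
  ids <- Biobase::featureNames(filt_eset2)
  db  <- rae230a.db::rae230a.db
  print(key_overlap(ids, AnnotationDbi::keys(db, keytype = "PROBEID")))
  anno_filt_eset2 <- AnnotationDbi::select(
    db, keys = ids,
    columns = c("SYMBOL", "GENENAME", "ENTREZID"),
    keytype = "PROBEID"
  )
}
```

An overlap near zero suggests the annotation package does not match the array, while an overlap near one suggests select() should succeed.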
ADD REPLY
