Question

Mismatch in probe ids

3

4.9 years ago

Dataminer ★ 2.8k

Dear members,

I am following the "An end to end workflow for differential gene expression using Affymetrix microarrays" workflow, as described here. One of its steps maps probe IDs to gene names and symbols.

The workflow uses:

anno_palmieri <- AnnotationDbi::select(hugene10sttranscriptcluster.db,
                                       keys = (featureNames(palmieri_manfiltered)),
                                       columns = c("SYMBOL", "GENENAME"),
                                       keytype = "PROBEID")

However, I get this error:

Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.

After some digging, I found that my keys look like this:

head(featureNames(palmieri_manfiltered))

"1007_s_at" "1053_at" "117_at" "121_at" "1255_g_at" "1294_at"

whereas if I look at the PROBEID keys in hugene10sttranscriptcluster.db using

head(keys(hugene10sttranscriptcluster.db, keytype = "PROBEID"))

they look like this:

[1] "7892501" "7892502" "7892503" "7892504" "7892505" "7892506"

Can anyone help me fix this issue or point me in the right direction? Thank you.

Affymetrix Microarray • 2.2k views

ADD COMMENT • link updated 4.9 years ago by Kevin Blighe 89k • written 4.9 years ago by Dataminer ★ 2.8k

1

It looks to me like those identifiers belong to a different Affymetrix array. I don't know how you can solve this with the hugene10sttranscriptcluster.db package, but as an alternative you can try the biobtreeR package for these mappings. The following example query maps the probe IDs you mentioned to Ensembl gene identifiers and gene symbols:

bbMapping("1007_s_at,1053_at", source = "affy_hg_u133_plus_2", 'map(transcript).map(ensembl)', attrs = "name")

ADD REPLY • link 4.9 years ago by tamerg ▴ 100

score 3 · Accepted Answer · 2020-06-19

3

4.9 years ago

Kevin Blighe 89k

Which array are you using? Your probe IDs are not from the Human Gene 1.0 ST Array - they seem to be from the HG-U95. So, you could try either of these DBs instead:

hgu95av2.db
hgu95a.db

Kevin

ADD COMMENT • link 4.9 years ago by Kevin Blighe 89k
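One way to check which annotation package actually matches a set of probe IDs is to measure how many of them appear among each package's PROBEID keys. The following is a minimal sketch, not a definitive method: `key_overlap` is a hypothetical helper, and the candidate list is an assumption that combines the packages named in this thread with hgu133plus2.db, which corresponds to the array used as `source` in the comment above. Packages that are not installed are skipped and reported as NA.

```r
# Hypothetical helper: fraction of probe IDs found in a candidate key set.
key_overlap <- function(ids, candidate_keys) {
  if (length(ids) == 0) return(NA_real_)
  mean(ids %in% candidate_keys)
}

# Probe IDs quoted in the question.
ids <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")

# Candidate annotation packages (an assumed list; extend as needed).
candidates <- c("hugene10sttranscriptcluster.db", "hgu95av2.db", "hgu133plus2.db")

overlap <- vapply(candidates, function(pkg) {
  # Skip packages that are not installed instead of failing.
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA_real_)
  # Each of these packages exports a database object named after the package.
  db <- getExportedValue(pkg, pkg)
  key_overlap(ids, AnnotationDbi::keys(db, keytype = "PROBEID"))
}, numeric(1))

print(overlap)
```

A value close to 1 suggests the package matches the array; a value of 0 reproduces the situation behind the "None of the keys entered are valid keys" error. In practice, pass `featureNames(palmieri_manfiltered)` instead of the six example IDs.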

0

Hi, I'm having exactly the same problem as the original poster.

But my IDs are, for example, "41220_PM_at", "41329_PM_at", "41386_PM_i_at", "41387_PM_r_at" ...

The point is: how do I find out whether the correct DB is hugene10sttranscriptcluster, hgu95a, or something else? I'm trying to do this automatically for several different microarrays. Could I get it from the .CEL files?

ADD REPLY • link 4.9 years ago by diogo.pellegrina ▴ 10

0

You can read the CEL header with:

affyio::read.celfile.header()

ADD REPLY • link 4.9 years ago by Kevin Blighe 89k
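To automate this over several arrays, the CDF name stored in each CEL header can be turned into a guess at the annotation package name. The sketch below assumes a naming rule (strip non-alphanumeric characters, lowercase, append ".db") that fits many 3' expression arrays but not all platforms; Gene ST arrays, for instance, use names such as hugene10sttranscriptcluster.db that this rule will not produce. `guess_annotation_pkg` and `cel_cdf_names` are hypothetical helpers, and the directory path in the usage comment is a placeholder.

```r
# Hypothetical helper: guess an annotation package name from a CDF name.
# Assumed rule: drop non-alphanumeric characters, lowercase, append ".db".
guess_annotation_pkg <- function(cdf_name) {
  paste0(tolower(gsub("[^[:alnum:]]", "", cdf_name)), ".db")
}

# Hypothetical helper: read the cdfName field from each CEL file header.
# Requires the affyio package; it is only needed when this function is called.
cel_cdf_names <- function(cel_files) {
  vapply(cel_files,
         function(f) affyio::read.celfile.header(f)$cdfName,
         character(1))
}

guess_annotation_pkg("HG-U133_Plus_2")  # "hgu133plus2.db"

# Usage on real files (placeholder path):
# cel_files <- list.files("path/to/cel_dir", pattern = "\\.CEL$",
#                         ignore.case = TRUE, full.names = TRUE)
# unique(guess_annotation_pkg(cel_cdf_names(cel_files)))
```

Treat the result as a starting point and confirm that the guessed package exists in Bioconductor and that its PROBEID keys contain your probe IDs before running `select()`.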