Mismatch in probe ids
4.4 years ago
Dataminer ★ 2.8k

Dear members,

I am following "An end to end workflow for differential gene expression using Affymetrix microarrays", as described here. One of its steps maps probe IDs to gene names and symbols.

The workflow uses:

anno_palmieri <- AnnotationDbi::select(hugene10sttranscriptcluster.db,
                                       keys = (featureNames(palmieri_manfiltered)),
                                       columns = c("SYMBOL", "GENENAME"),
                                       keytype = "PROBEID")

However, I get this error:

Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.

After some digging, I found that my keys look like this:

head(featureNames(palmieri_manfiltered))

"1007_s_at" "1053_at" "117_at" "121_at" "1255_g_at" "1294_at"

whereas if I list the PROBEID keys from hugene10sttranscriptcluster.db using

head(keys(hugene10sttranscriptcluster.db, keytype = "PROBEID"))

they look like this:

[1] "7892501" "7892502" "7892503" "7892504" "7892505" "7892506"
  

Can anyone help me fix this issue or point me in the right direction? Thank you.

Affymetrix Microarray • 2.0k views

It looks to me like those identifiers belong to a different Affymetrix array. I don't know how to solve this with the hugene10sttranscriptcluster.db package, but alternatively you can try the biobtreeR package for these mappings. The following example query maps the probe IDs you mentioned to Ensembl gene identifiers and gene symbols:

bbMapping("1007_s_at,1053_at", 'map(transcript).map(ensembl)', source = "affy_hg_u133_plus_2", attrs = "name")

4.4 years ago

Which array are you using? Your probe IDs are not from the Human Gene 1.0 ST Array - they seem to be from the HG-U95. So, you could try either of these DBs instead:

hgu95av2.db
hgu95a.db
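
One way to check whether a candidate package fits is to measure what fraction of your feature names appear among its PROBEID keys. Below is a minimal sketch in base R that uses only the IDs quoted in the question as stand-in data. `key_overlap` is a hypothetical helper, not part of any package; in practice `db_keys` would come from `keys(<package>, keytype = "PROBEID")` for each candidate.

```r
# Hypothetical helper: fraction of array feature names that are found among
# an annotation package's PROBEID keys. A value near 0 suggests the package
# does not match the array.
key_overlap <- function(feature_ids, db_keys) {
  mean(feature_ids %in% db_keys)
}

# Stand-in data copied from the question (the first six IDs only,
# not a full key list).
my_ids <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
hugene_keys <- c("7892501", "7892502", "7892503",
                 "7892504", "7892505", "7892506")

overlap_hugene <- key_overlap(my_ids, hugene_keys)  # 0: no shared IDs
overlap_self   <- key_overlap(my_ids, my_ids)       # 1: every ID matches
```

Running the same check against each candidate and keeping the one with the highest overlap is one possible way to choose between them.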

Kevin


Hi, I'm having exactly the same problem as the original poster.

But my IDs are, for example, "41220_PM_at", "41329_PM_at", "41386_PM_i_at", "41387_PM_r_at" ...

The point is, how do I find out whether the correct DB is hugene10sttranscriptcluster, hgu95a, or any other? I'm trying to do this automatically for several different microarrays. Could I get it from the .CEL files?


You can read the CEL header with:

affyio::read.celfile.header()
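
The list it returns includes a `cdfName` element that names the chip type. Here is a rough sketch of turning that name into a candidate annotation package name. `guess_annotation_pkg` is a hypothetical helper, and its rule (lowercase, strip punctuation, append ".db") is an assumption. It matches the names of many 3' expression array packages such as hgu95av2.db, but not the Gene ST packages such as hugene10sttranscriptcluster.db, so treat the result as a guess to verify.

```r
# Hypothetical helper: derive a candidate annotation package name from a
# CDF/chip name by lowercasing it, dropping non-alphanumeric characters,
# and appending ".db". This naming rule is an assumption, not a guarantee.
guess_annotation_pkg <- function(cdf_name) {
  paste0(tolower(gsub("[^[:alnum:]]", "", cdf_name)), ".db")
}

pkg_u95  <- guess_annotation_pkg("HG_U95Av2")       # "hgu95av2.db"
pkg_u133 <- guess_annotation_pkg("HG-U133_Plus_2")  # "hgu133plus2.db"

# With a real file (the path is a placeholder):
# hdr <- affyio::read.celfile.header("sample.CEL")
# guess_annotation_pkg(hdr$cdfName)
```

Before relying on the guess, it is worth confirming that the package exists and that its PROBEID keys actually overlap with your feature names.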
