Question

Problems Converting Gene Names To Numeric Values In R

1

10.8 years ago

shaikhfarahdeeba ▴ 20

Hi, I am carrying out differential gene expression analysis using limma. Next I need to do gene set enrichment analysis using GOstats, but there's a problem. These are my differentially expressed genes:

 [1] "1557994_at"       "205933_at"        "1559688_at"      
 [4] "232837_at"        "212253_x_at"      "212845_at"       
 [7] "233520_s_at"      "236931_at"        "205054_at"       
[10] "237981_at"        "209896_s_at"      "221718_s_at"     
[13] "226648_at"        "208195_at"        "211928_at"

But when I convert the character vector to numeric, I get a warning that NAs were introduced by coercion, and the result looks like this:

 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[76] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

How do I solve this problem? Also, when I run the analysis, it takes hours and produces no output.

• 4.0k views

ADD COMMENT • link updated 10.8 years ago by Devon Ryan 104k • written 10.8 years ago by shaikhfarahdeeba ▴ 20

2

Are you literally just calling as.numeric(d) on a character vector d (just as an example)? That will always produce NAs, since there's no obvious conversion between probe IDs like those and numbers. You can call as.numeric(c("1","2","100")), since those are just character representations of numbers, but you have probe IDs.

ADD REPLY • link 10.8 years ago by Devon Ryan 104k
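
To make the point above concrete, here is a minimal base R sketch. The variable names (probe_ids, converted, numeric_strings) are made up for illustration; the probe IDs are taken from the question.

```r
# Probe IDs are labels, not numbers, so coercion yields NA for each one.
probe_ids <- c("1557994_at", "205933_at", "212253_x_at")
converted <- suppressWarnings(as.numeric(probe_ids))  # warning silenced for the demo
print(converted)

# Strings that merely spell out numbers convert without trouble.
numeric_strings <- as.numeric(c("1", "2", "100"))
print(numeric_strings)
```

The first call returns one NA per probe ID, which is the output shown in the question; the second returns ordinary numbers.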

0

Is it necessary to convert them into a numeric vector?

ADD REPLY • link 10.8 years ago by shaikhfarahdeeba ▴ 20

1

Have you read the GOstats documentation (PDF)? Nowhere does it mention converting probeset IDs to numeric values. Perhaps what you want is to convert them to Entrez Gene IDs?

ADD REPLY • link 10.8 years ago by Neilfws 49k

0

How am I supposed to move ahead? I have been trying dis for the past 10 days but couldn't get the result.

ADD REPLY • link 10.8 years ago by shaikhfarahdeeba ▴ 20

0

I generated the top 500 genes and saved their row names in a vector rn:

rn <- rownames(toptable(fit, coef=2, n=500))
rn
rn <- as.numeric(rn)
dat.s <- eset.new[rn,]

I created an object dat.s to store the differentially expressed genes, but I'm getting only NAs.

ADD REPLY • link 10.8 years ago by shaikhfarahdeeba ▴ 20
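
For the subsetting step in the comment above, no numeric conversion is needed: R can index rows by name. The sketch below uses a small made-up matrix (expr, with hypothetical sample names s1 to s3) as a stand-in for eset.new; an ExpressionSet should accept character feature names in the same way, though that has not been checked against the poster's object.

```r
# Toy expression matrix whose row names are probe IDs (values are random).
set.seed(1)
all_probes <- c("1557994_at", "205933_at", "1559688_at", "232837_at")
expr <- matrix(rnorm(12), nrow = 4,
               dimnames = list(all_probes, c("s1", "s2", "s3")))

# Keep rn as a character vector and index by row name directly.
rn <- c("205933_at", "232837_at")
dat.s <- expr[rn, , drop = FALSE]  # drop = FALSE keeps the matrix shape
print(dat.s)
```

Indexing with NA values, which is what as.numeric(rn) produces, is what fills the result with NAs.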

score 2 · Answer 1 · 2014-02-19

2

10.8 years ago

Devon Ryan 104k

There are annotation packages for most arrays you'll ever use in R. You'll find that easier than trying to roll your own solution.

>library("hgu133plus2.db")
>d
 [1] "1557994_at"  "205933_at"   "1559688_at"  "232837_at"   "212253_x_at"
 [6] "212845_at"   "233520_s_at" "236931_at"   "205054_at"   "237981_at"  
[11] "209896_s_at" "221718_s_at" "226648_at"   "208195_at"   "211928_at"  
>select(hgu133plus2.db, d, "SYMBOL", "PROBEID")
       PROBEID  SYMBOL
1   1557994_at     TTN
2    205933_at  SETBP1
3   1559688_at   GRAPL
4    232837_at  KIF13A
5  212253_x_at     DST
6    212845_at  SAMD4A
7  233520_s_at   CMYA5
8    236931_at    <NA>
9    205054_at     NEB
10   237981_at   CMYA5
11 209896_s_at  PTPN11
12 221718_s_at  AKAP13
13   226648_at  HIF1AN
14   208195_at     TTN
15   211928_at DYNC1H1

ADD COMMENT • link 10.8 years ago by Devon Ryan 104k

0

Thnx Ryan, but this will only give me the symbols; I have to do the hypergeometric test using GOstats. Plz if u could help on this.

ADD REPLY • link 10.8 years ago by shaikhfarahdeeba ▴ 20

0

That's just an example. It looks like GOstats is expecting an Entrez ID, so just use ENTREZID instead of SYMBOL. You could even get the associated GO terms directly by using GO instead (you'd most likely have to roll your own test function then).

As an aside, you have a full keyboard on your computer. There's no need to use things like "Plz" or "u" or "dis".

ADD REPLY • link 10.8 years ago by Devon Ryan 104k
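
Putting the advice in this thread together, a rough sketch of the probe ID to Entrez ID to GOstats route might look like the following. It assumes the Bioconductor packages hgu133plus2.db and GOstats are installed. The function name go_enrichment_from_probes and its arguments are hypothetical, and using every probe on the array as the gene universe is an assumption rather than a recommendation; a filtered universe is often preferable. The code is wrapped in a function so nothing runs until it is called.

```r
# Hypothetical helper: map probe IDs to Entrez IDs, then run a
# hypergeometric GO over-representation test with GOstats.
go_enrichment_from_probes <- function(probe_ids, ontology = "BP",
                                      p_cutoff = 0.05) {
  library("hgu133plus2.db")
  library("GOstats")

  # Probe ID -> Entrez ID, the same select() call as in the answer,
  # with ENTREZID in place of SYMBOL.
  hits <- AnnotationDbi::select(hgu133plus2.db, keys = probe_ids,
                                columns = "ENTREZID", keytype = "PROBEID")
  selected <- unique(hits$ENTREZID[!is.na(hits$ENTREZID)])

  # Assumption: the universe is every Entrez ID represented on the array.
  all_probes <- AnnotationDbi::keys(hgu133plus2.db, keytype = "PROBEID")
  bg <- AnnotationDbi::select(hgu133plus2.db, keys = all_probes,
                              columns = "ENTREZID", keytype = "PROBEID")
  universe <- unique(bg$ENTREZID[!is.na(bg$ENTREZID)])

  params <- new("GOHyperGParams",
                geneIds = selected,
                universeGeneIds = universe,
                annotation = "hgu133plus2.db",
                ontology = ontology,
                pvalueCutoff = p_cutoff,
                conditional = FALSE,
                testDirection = "over")
  summary(hyperGTest(params))
}
```

With the character vector d from the answer, go_enrichment_from_probes(d) would be the call. Probes without an Entrez ID, such as 236931_at in the symbol table above, are likely to map to NA and are dropped before the test.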