Help with converting gene symbols to gene IDs
5.0 years ago
nattzy94 ▴ 60

I am trying to convert gene symbols in one column to gene IDs. I am using the following code to achieve this:

for (i in 1:nrow(data_gsea2)){   
data_gsea2$ID[i] <-  mapIds(org.Hs.eg.db, (data_gsea2$ID[i]), "ENTREZID", "SYMBOL") 
}

I have a data frame of 7386 rows (genes), and this is taking forever to complete. I'm sure there is a smarter way to do this, but I'm not sure how. Can anyone help?

Thanks very much!

R gsea • 1.8k views

If you've verified this command works as you expect it to (by, say, running it on ten rows instead of the entire data.frame), you'll probably just have to wait for the process to complete.


Yes, it works but takes a long time.


Try using apply (or even better, mclapply) instead of using a loop.

5.0 years ago
Ahill ★ 2.0k

The mapIds manual indicates that it accepts multiple keys in a single call, rather than one key at a time. In a different test case that I have on hand here, passing all the keys at once is much faster:

require(AnnotationDbi)
require(hgu95av2.db)
keys <- head(keys(hgu95av2.db, 'ENTREZID'), 100)

# sapply - one key at a time, 100 mapIds() calls
system.time(sapply(keys, function(z) mapIds(hgu95av2.db, keys=z, column="ALIAS", keytype="ENTREZID")))
<snip>
   user  system elapsed 
   5.35    0.64    6.11 

# send all keys at once, 1 mapIds() call
system.time(mapIds(hgu95av2.db, keys, column="ALIAS", keytype="ENTREZID"))
<snip>
   user  system elapsed 
   0.05    0.01    0.06
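
Applied to the data frame from the question, the same idea might look roughly like the sketch below. The three-row data_gsea2 here is a hypothetical stand-in for the real 7386-row data frame, and the ENTREZID column name is my own choice; the sketch assumes the ID column holds HGNC gene symbols and that org.Hs.eg.db is installed.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Hypothetical stand-in for the data frame in the question;
# the last symbol is deliberately invalid.
data_gsea2 <- data.frame(ID = c("TP53", "BRCA1", "NOT_A_GENE"),
                         stringsAsFactors = FALSE)

# One mapIds() call for the whole column instead of one call per row.
# as.character() guards against the column being a factor.
entrez <- mapIds(org.Hs.eg.db,
                 keys = as.character(data_gsea2$ID),
                 column = "ENTREZID",
                 keytype = "SYMBOL",
                 multiVals = "first")

# Store the result in a new column so the original symbols are kept.
data_gsea2$ENTREZID <- unname(entrez)
```

Writing to a separate column, rather than overwriting ID as the loop in the question does, makes it easy to find the symbols that did not map: they come back as NA and can be listed with data_gsea2$ID[is.na(data_gsea2$ENTREZID)].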

Good catch, Ahill! Can't believe I missed this - OP is submitting keys one by one in a loop instead of submitting a vector of keys!
