Question

Map gene names to gene ids.

0

10.3 years ago

kandoigaurav ▴ 150

I have some gene names (e.g., 11-cis-retinol dehydrogenase, D-2-hydroxyacid dehydrogenase (NAD+), 3alpha-hydroxysteroid 3-dehydrogenase) and their corresponding EC numbers. I want to map these gene names to other gene IDs such as Entrez Gene ID, Ensembl ID, etc.

There are more than 1k entries, so I can't annotate them manually.

Can someone suggest a way to map these names to IDs?

Mapping Gene Proteins Nomenclature ID conversion • 6.2k views

ADD COMMENT • link updated 2.9 years ago by Ram 44k • written 10.3 years ago by kandoigaurav ▴ 150

Ram · Answer 1 · 2014-08-29

0

10.3 years ago

komal.rathi ★ 4.1k

Have you ever used BioMart? Which organism do these genes belong to? Which fields exactly are you looking for?

ADD COMMENT • link 10.3 years ago by komal.rathi ★ 4.1k

0

Yep. It doesn't limit the query to Gene names.

ADD REPLY • link 10.3 years ago by kandoigaurav ▴ 150

0

It does, e.g., via HGNC Symbol or WikiGene Name. However, as Devon Ryan suggested, using the correct AnnotationDbi package would be more appropriate and will give more "accurate" results.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.3 years ago by komal.rathi ★ 4.1k

0

I want to convert the gene names and not the symbols.

ADD REPLY • link 10.3 years ago by kandoigaurav ▴ 150

0

Got it! Use Bioconductor. It will give you Gene Symbols, Entrez IDs, Ensembl Gene IDs, etc. for your Gene Names.

ADD REPLY • link updated 2.9 years ago by Ram 44k • written 10.3 years ago by komal.rathi ★ 4.1k

0

Okay. Sounds cool. Lemme try. Thanks

ADD REPLY • link 10.3 years ago by kandoigaurav ▴ 150

0

Umm, can you tell me how to implement the package?

ADD REPLY • link 10.3 years ago by kandoigaurav ▴ 150

1

The general idea is to make a character vector of the gene names you want to look up and then do something like select(org.Mm.eg.db, keys=genes, columns=c("SYMBOL","ENTREZID","ENSEMBL"), keytype="GENENAME"). That will look up the gene symbol, Entrez ID, and Ensembl ID associated with each gene name in the genes vector. Note that this isn't a bullet-proof method. For example, it won't find any of your examples, because it expects other names: "11-cis-retinol dehydrogenase" is also called "retinol dehydrogenase 5", and that one will be found. All of these values come from Entrez, so there aren't mappings for every possible name.

If this doesn't work, I'd try something from this thread: Gene Id Conversion Tool
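For anyone new to Bioconductor, the lookup described above could be sketched roughly as follows. This is only an illustration: it assumes the AnnotationDbi and org.Mm.eg.db (mouse) packages are already installed, the two input names are just the ones mentioned in this thread, and the variable names (genes, found, missing, ids) are made up for the example. Use org.Hs.eg.db instead if your genes are human.

```r
# Sketch only: assumes AnnotationDbi and org.Mm.eg.db are installed from Bioconductor.
library(AnnotationDbi)
library(org.Mm.eg.db)

# Illustrative input: gene names (not symbols) to look up.
genes <- c("retinol dehydrogenase 5", "11-cis-retinol dehydrogenase")

# Split the input into names the database knows and names it does not,
# so the unmatched ones can be handled separately (e.g., via EC numbers).
known_names <- keys(org.Mm.eg.db, keytype = "GENENAME")
found   <- genes[genes %in% known_names]
missing <- setdiff(genes, found)

# select() raises an error when none of the keys are valid,
# so only call it if at least one name was recognised.
if (length(found) > 0) {
  ids <- AnnotationDbi::select(org.Mm.eg.db,
                               keys    = found,
                               columns = c("SYMBOL", "ENTREZID", "ENSEMBL"),
                               keytype = "GENENAME")
  print(ids)
}

# Names that need another route.
print(missing)
```

Checking the names against keys() first is a design choice for this sketch; it makes it easy to see how many of the >1k names match exactly before trying anything else.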

ADD REPLY • link 10.3 years ago by Devon Ryan 104k

0

You can also try the EC IDs, which AnnotationDbi calls ENZYME. That might end up working a bit better.
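A rough sketch of the EC-number route, under the same assumptions as before (AnnotationDbi and org.Mm.eg.db installed). The EC numbers below are placeholders rather than the ones from the question, and the sketch assumes the ENZYME keys are stored as bare numbers without an "EC" prefix, so any prefix is stripped first. The variable names are hypothetical.

```r
# Sketch only: assumes AnnotationDbi and org.Mm.eg.db are installed from Bioconductor.
library(AnnotationDbi)
library(org.Mm.eg.db)

# Placeholder EC numbers; replace with the ones that accompany your gene names.
ec_numbers <- c("EC 1.1.1.1", "1.1.1.2")

# Assumption: ENZYME keys look like "1.1.1.1", so drop any leading "EC" label.
ec_clean <- sub("^EC[ :]*", "", ec_numbers)

# Keep only EC numbers present in the database, since select() errors
# when none of the supplied keys are valid.
ec_found <- ec_clean[ec_clean %in% keys(org.Mm.eg.db, keytype = "ENZYME")]

if (length(ec_found) > 0) {
  ec_ids <- AnnotationDbi::select(org.Mm.eg.db,
                                  keys    = ec_found,
                                  columns = c("GENENAME", "SYMBOL", "ENTREZID", "ENSEMBL"),
                                  keytype = "ENZYME")
  print(ec_ids)
}
```

One EC number can map to several genes, so expect more than one row per key in the result.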

ADD REPLY • link 10.3 years ago by Devon Ryan 104k

score 0 · Answer 2 · 2014-08-29

0

10.3 years ago

Devon Ryan 104k

Have you tried the appropriate AnnotationDbi package in Bioconductor (e.g., org.Mm.eg.db or org.Hs.eg.db)?

ADD COMMENT • link 10.3 years ago by Devon Ryan 104k

0

No. Does it convert gene names to other IDs? I haven't used Bioconductor yet.

ADD REPLY • link 10.3 years ago by kandoigaurav ▴ 150