Question

Convert gene name to uniprot ID

2

11.1 years ago

diablo82.26 ▴ 20

I have a list of gene names in a file:

CHRNB2
EGR2
GCK
KRT14
LMNA
FGF3
TK2
ABCC8

How can I map them to UniProt IDs?

P.S. I tried the UniProt "ID mapping" tool (from "GENEID" to "UNIPROTKB AC"), but it couldn't map them.

Please suggest what I should do. Thanks.

gene SNP uniprot • 33k views

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by diablo82.26 ▴ 20

1

11.1 years ago

aheinzel ▴ 130

Your entries look like gene symbols. Several options are available, to name just a few:

DAVID gene ID conversion tool (choose official_gene_symbol during upload)
Ensembl BioMart (use Filters -> ID list limit and choose HGNC symbols to restrict the results to your genes of interest; select the symbol and the UniProt/TrEMBL accession from the Attributes section to get a mapping file)
IdMapper ExcelAddIn (convert first from GeneSymbol to ENSG, and from there on to UniprotID)

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by aheinzel ▴ 130

1

11.1 years ago

Prakki Rama ★ 2.7k

Adding to the above list, you can also try bioDBnet.

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by Prakki Rama ★ 2.7k

1

11.1 years ago

cdsouthan ★ 1.9k

For HGNC symbols it's useful to go through the HGNC Symbol Checker first.

You can then do some cross-checks (e.g. see if the names are what you expect and if any symbols are outdated).

Then paste the HGNC ID column across to the UniProt ID mapper.

You can then filter by species and by reviewed status (reviewed = Swiss-Prot).

(It would be interesting if you ran a few thousand symbols through all the methods above and told us how you got on!)

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by cdsouthan ★ 1.9k

0

A note of caution, related to this question, from the results of sequentially restricting human Swiss-Prot by cross-references to HGNC, Entrez Gene and Ensembl (queries and counts below). It thus looks like http://www.ncbi.nlm.nih.gov/pubmed/24939910, which suggests sub-19K numbers, could be not far short of the mark, or at least nearer than the (conservative back then) 25K estimate of a decade ago: http://www.ncbi.nlm.nih.gov/pubmed/15174140

http://www.uniprot.org/uniprot/?query=%28organism%3A%22Homo+sapiens+[9606]%22%29+AND+reviewed%3Ayes&sort=score = 20,213

http://www.uniprot.org/uniprot/?query=%28organism%3A%22Homo+sapiens+[9606]%22%29+AND+reviewed%3Ayes+AND+database%3A%28type%3Ahgnc%29&sort=score = 19,760

http://www.uniprot.org/uniprot/?query=%28organism%3A%22Homo+sapiens+[9606]%22%29+AND+reviewed%3Ayes+AND+database%3A%28type%3Ahgnc%29+AND+database%3A%28type%3Ageneid%29&sort=score = 18,768

http://www.uniprot.org/uniprot/?query=%28organism%3A%22Homo+sapiens+[9606]%22%29+AND+reviewed%3Ayes+AND+database%3A%28type%3Ahgnc%29+AND+database%3A%28type%3Ageneid%29+AND+database%3A%28type%3Aensembl%29&sort=score = 18,550

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by cdsouthan ★ 1.9k

1

11.1 years ago

Elisabeth Gasteiger ★ 2.4k

You should be able to obtain what you are looking for by following the instructions in this UniProt FAQ:

"Can I convert gene symbols to UniProtKB identifiers? How can I map UniProtKB IDs or ACs to gene symbols?"

Please note that it is planned to extend the UniProt identifier mapping to gene symbols.

Don't hesitate to contact the UniProt helpdesk if you have additional questions.

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by Elisabeth Gasteiger ★ 2.4k

Ram · Accepted Answer · 2014-07-03

6

11.1 years ago

networkbiothings ▴ 60

Use MyGene.info. You can send batch requests via HTTP POST from your own script, or you can run the same batch requests interactively through the live API page.

Here's how via the live API:

Click on the "Try API live!", select "gene query service". Click on "post"

For "q" put in your gene names separated by a comma.

For "scopes" type "symbol" (without the quotation marks)

For "fields", use "symbol,entrezgene,uniprot" and any other parameter of interest

Click "try it" when done.

Result will be in the response body.
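For anyone who prefers to script this rather than click through the live API page, here is a minimal sketch of the same POST request using only the Python standard library. The endpoint URL (`https://mygene.info/v3/query`) and the `species` restriction are my assumptions, not part of the steps above, and `build_form` / `query_mygene` are just illustrative helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the v3 query endpoint of MyGene.info.
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_form(symbols, fields="symbol,entrezgene,uniprot", species="human"):
    """Build the same form fields as entered on the live API page."""
    return {
        "q": ",".join(symbols),   # gene names separated by commas
        "scopes": "symbol",       # interpret the terms as gene symbols
        "fields": fields,         # what to return for each hit
        "species": species,       # assumption: restrict hits to human
    }


def query_mygene(symbols):
    """POST the batch query and return the decoded JSON (a list of hits).

    Needs network access; not called anywhere in this sketch.
    """
    data = urllib.parse.urlencode(build_form(symbols)).encode("utf-8")
    request = urllib.request.Request(MYGENE_QUERY_URL, data=data)  # data => POST
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `query_mygene(["CHRNB2", "EGR2", "GCK"])` should return one or more hits per symbol; the UniProt accessions, when present, sit under the `uniprot` key of each hit.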

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by networkbiothings ▴ 60

3

There is also a "mygene" Python module available.

Usage examples are available in this ID mapping tutorial.
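A rough sketch of how that might look with the module, assuming it is installed (`pip install mygene`) and that you want human, reviewed (Swiss-Prot) accessions only. `swissprot_accessions` and `map_symbols` are hypothetical helper names; the only mygene calls used are `MyGeneInfo()` and `querymany()`.

```python
def swissprot_accessions(hits):
    """Map each queried symbol to its Swiss-Prot accessions.

    `hits` is a list of dicts in the shape returned by querymany():
    each has a "query" key, and either "notfound" or a "uniprot" dict
    whose "Swiss-Prot" value is a string or a list of strings.
    """
    mapping = {}
    for hit in hits:
        accessions = mapping.setdefault(hit["query"], [])
        if hit.get("notfound"):
            continue  # keep the symbol with an empty list so misses are visible
        value = hit.get("uniprot", {}).get("Swiss-Prot", [])
        if isinstance(value, str):
            value = [value]
        for accession in value:
            if accession not in accessions:
                accessions.append(accession)
    return mapping


def map_symbols(symbols, species="human"):
    """Query MyGene.info through the mygene module (needs network access)."""
    import mygene  # third-party; imported lazily so the helper above works without it

    client = mygene.MyGeneInfo()
    hits = client.querymany(symbols, scopes="symbol", fields="uniprot", species=species)
    return swissprot_accessions(hits)
```

Symbols that return nothing end up with an empty list, which makes it easy to spot outdated or misspelled symbols before trusting the mapping.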

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by Newgene ▴ 370

0

Very helpful. Thanks

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.1 years ago by diablo82.26 ▴ 20

Ram · Accepted Answer · 2014-07-03

These look like gene symbols, not Entrez Gene identifiers, so identifier mapping from Entrez Gene would not work. However, UniProt includes most gene symbols and their various synonyms in its data, so a query should work and find the set of UniProtKB entries that match. For example:

Go to the UniProt.org website
Select "Protein Knowledgebase (UniProtKB)" for "Search in"
For the "Query" enter 'gene:' followed by the gene symbol (e.g. "gene:CHRNB2")
Click the "Search" button
The results contain a list of UniProtKB entries matching this gene symbol (e.g. for CHRNB2)

I suspect you may have a specific species in mind so you might want to use additional terms to limit the results further.

Once you have worked out the form of the required query you could use the UniProt.org REST API to script the required queries and return only selected data.
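As a starting point for scripting, here is a small sketch that builds such a query URL and parses a tab-separated response. It assumes the current REST endpoint (`https://rest.uniprot.org/uniprotkb/search`) and its `gene_exact`, `organism_id` and `reviewed` query fields, which differ from the website form described above, so check them against the UniProt API documentation before relying on this. The function names are illustrative only.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumption: the current UniProtKB search endpoint.
UNIPROT_SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(symbol, taxon_id=9606, reviewed_only=True):
    """Build a search URL for one gene symbol, limited to one species."""
    # gene_exact is stricter than the plain gene: field used on the website.
    query = f"gene_exact:{symbol} AND organism_id:{taxon_id}"
    if reviewed_only:
        query += " AND reviewed:true"  # Swiss-Prot entries only
    params = {"query": query, "fields": "accession,gene_names", "format": "tsv"}
    return UNIPROT_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_tsv(text):
    """Return the first column (accession) of each data row in a TSV response."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader, None)  # skip the header row
    return [row[0] for row in reader if row]


def fetch_accessions(symbol):
    """Run the query for one symbol (needs network access; not called here)."""
    with urllib.request.urlopen(build_search_url(symbol), timeout=30) as response:
        return parse_tsv(response.read().decode("utf-8"))
```

Looping `fetch_accessions` over the symbols in your file gives one list of accessions per gene; the default `taxon_id` of 9606 restricts the results to human, which addresses the species point above.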