Conversion from UniProt ID to Gene Symbol does not work well.

5.3 years ago

entropy ▴ 50

This seems to be a basic process but somehow I could not find the best answer yet.

I am trying to convert UniProt IDs into gene symbols. I ran this code, but about 1/4 of my list returns NA. Is there a better way to get the full conversion right?

uniprots[1:5]  # "A0QVH7" "A0R666" "A0SYQ0" "A3RGC1" "A5A4K8"

length(uniprots)  # 64102

library(org.Hs.eg.db)

z <- AnnotationDbi::select(org.Hs.eg.db, keys = uniprots, columns = "SYMBOL", keytype = "UNIPROT")

dim(z) # 64320     2

length( which( is.na( z$SYMBOL ) ) ) # 15789

genome sequencing RNA-Seq • 3.2k views

updated 5.3 years ago by ATpoint 86k • written 5.3 years ago by entropy ▴ 50

Verify the ones that are not converting using UniProt's ID mapping tool (from UniProt ID to Gene name). They may either not be human proteins or have been deprecated in recent releases.
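To collect the IDs that failed so they can be pasted or uploaded into the ID mapping tool, something along these lines may help. This is only a sketch: the toy data frame stands in for the `z` returned by `select()` in the question, and the IDs, gene names, and output file name are made-up placeholders. It also shows one likely reason `z` has more rows than the input list: a single UniProt ID can map to more than one symbol.

```r
# Toy stand-in for the data frame returned by select(); in practice use `z` from the question.
# All IDs and symbols below are hypothetical placeholders.
z <- data.frame(
  UNIPROT = c("ID_A", "ID_B", "ID_B", "ID_C"),
  SYMBOL  = c("GENE1", "GENE2", "GENE3", NA),
  stringsAsFactors = FALSE
)

# IDs with no gene symbol in the human annotation package
unmapped <- unique(z$UNIPROT[is.na(z$SYMBOL)])

# IDs that appear on more than one row (one-to-many mappings or repeated input IDs)
multi_mapped <- unique(z$UNIPROT[duplicated(z$UNIPROT)])

# One ID per line, ready to upload to an ID mapping tool (file name is arbitrary)
out_file <- file.path(tempdir(), "unmapped_uniprot_ids.txt")
writeLines(unmapped, out_file)
```

With the real `z`, `length(unmapped)` gives the number of distinct IDs to check, which can be smaller than the number of NA rows if the input list contains repeats.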

5.3 years ago by GenoMax 148k

Thanks. It looks like there are some. I just tried it and got this: "1,370 out of 1,433 identifiers from UniProtKB AC/ID were successfully mapped to 1,092 Gene name IDs."

For example I got this:

A7MBE0  SLC22A1

O08537  Esr2

and so on...

Actually, I uploaded 15K IDs but received only that many results; I am not sure why. I selected FROM: "UniProtKB AC/ID", TO: "GENE NAME" in the drop-down menu.

updated 5.3 years ago by GenoMax 148k • written 5.3 years ago by entropy ▴ 50

O08537 Esr2

That is a mouse protein.

A7MBE0 SLC22A1

This is a cow protein.

You are using the human database/library in your R code above, so it predictably cannot find these IDs.
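If the list mixes species, one option is to try several organism annotation packages in turn and keep the first hit per ID. The following is a rough sketch, not a tested recipe: `map_uniprot_multi` is a hypothetical helper name, the three package names are only an example selection (human, mouse, cow), and it assumes each installed package exposes a "UNIPROT" keytype, which the code checks before querying.

```r
# Hypothetical helper: look up gene symbols across several OrgDb packages.
# Packages that are not installed, or that lack a "UNIPROT" keytype, are skipped.
map_uniprot_multi <- function(ids,
                              db_names = c("org.Hs.eg.db", "org.Mm.eg.db", "org.Bt.eg.db")) {
  ids <- unique(ids)
  out <- data.frame(UNIPROT = ids,
                    SYMBOL = NA_character_,
                    SOURCE = NA_character_,
                    stringsAsFactors = FALSE)
  for (db_name in db_names) {
    if (!requireNamespace(db_name, quietly = TRUE)) next
    # The OrgDb object is exported under the same name as its package
    db <- getExportedValue(db_name, db_name)
    if (!"UNIPROT" %in% AnnotationDbi::keytypes(db)) next

    # Only query IDs that are still unresolved and known to this database;
    # mapIds() raises an error when none of the supplied keys is valid
    todo  <- out$UNIPROT[is.na(out$SYMBOL)]
    valid <- intersect(todo, AnnotationDbi::keys(db, keytype = "UNIPROT"))
    if (length(valid) == 0) next

    hits <- AnnotationDbi::mapIds(db, keys = valid, column = "SYMBOL",
                                  keytype = "UNIPROT", multiVals = "first")
    sel <- out$UNIPROT %in% valid
    out$SYMBOL[sel] <- unname(hits[out$UNIPROT[sel]])
    # Record the source only where a symbol was actually found
    out$SOURCE[sel & !is.na(out$SYMBOL)] <- db_name
  }
  out
}

# Usage (requires the Bioconductor packages to be installed):
# res <- map_uniprot_multi(uniprots)
# table(res$SOURCE, useNA = "ifany")
```

Note that `multiVals = "first"` keeps a single symbol per ID, so one-to-many mappings are collapsed; whether that is acceptable depends on the downstream analysis.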

5.3 years ago by GenoMax 148k

Thanks. I think that answers my question. Is there a way to scan all at once instead of looping per database?

5.3 years ago by entropy ▴ 50

I think you are asking if this can be done via R. Perhaps someone else would suggest an appropriate package.

You could also do this by programmatically accessing UniProt's site.
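As a rough illustration of the programmatic route, the sketch below queries UniProt's REST search endpoint in batches and returns accession, primary gene name, and organism for all species at once. Treat it as an assumption-laden example rather than the recommended method: the helper names (`uniprot_search_url`, `parse_uniprot_tsv`, `fetch_symbols`) are made up here, the endpoint URL and the `fields` names reflect the current UniProt REST API as far as I know (it has changed since this thread was written), and the batch size of 50 is an arbitrary choice to keep URLs short.

```r
# Build a search URL for one batch of accessions (endpoint and field names are assumptions;
# check the UniProt API documentation before relying on them).
uniprot_search_url <- function(ids,
                               base = "https://rest.uniprot.org/uniprotkb/search") {
  query <- paste(paste0("accession:", ids), collapse = " OR ")
  paste0(base,
         "?query=", utils::URLencode(query, reserved = TRUE),
         "&fields=accession,gene_primary,organism_name",
         "&format=tsv&size=", length(ids))
}

# Parse the tab-separated response text into a data frame with fixed column names
parse_uniprot_tsv <- function(txt) {
  df <- utils::read.delim(text = txt, quote = "", stringsAsFactors = FALSE)
  names(df) <- c("UNIPROT", "SYMBOL", "ORGANISM")
  df
}

# Fetch all IDs in batches; this function needs network access
fetch_symbols <- function(ids, batch_size = 50) {
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  parts <- lapply(batches, function(b) {
    txt <- paste(readLines(uniprot_search_url(b), warn = FALSE), collapse = "\n")
    parse_uniprot_tsv(txt)
  })
  res <- do.call(rbind, parts)
  rownames(res) <- NULL
  res
}

# Usage (not run here because it contacts the UniProt server):
# res <- fetch_symbols(c("A7MBE0", "O08537"))
```

IDs missing from the result are the ones worth checking by hand, since they may be obsolete or merged entries. For tens of thousands of IDs, adding a short pause between batches would be polite to the server.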

5.3 years ago by GenoMax 148k
