I used the bitr function from the clusterProfiler package to convert gene symbols from a DE experiment to UniProt protein IDs. For some gene symbols, several UniProt IDs are returned.
I expected each gene symbol to map to a single protein, and each protein to have a unique ID. Is my code correct, and does it matter that a single gene has multiple UniProt IDs?
My code is:
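library(clusterProfiler) # provides bitr()
library(org.Hs.eg.db)    # human annotation database used by bitr()
library(dplyr)           # provides distinct()
Genes <- c("AACS", "ACAA2", "ACADM", "ACLY", "ACOT8")
Protein_IDs <- bitr(Genes, fromType = "SYMBOL", toType = "UNIPROT", OrgDb = "org.Hs.eg.db") # returns 15 rows
test <- distinct(Protein_IDs, UNIPROT, .keep_all = TRUE) # still 15 rows: every UniProt ID is already unique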
SYMBOL UNIPROT
AACS Q86V21
AACS A0A024RBV2
ACAA2 B3KNP8
ACAA2 P42765
ACADM A0A0S2Z366
ACADM P11310
ACADM B7Z9I1
ACADM Q5HYG7
ACADM Q5T4U5
ACADM B4DJE7
ACLY A0A024R1T9
ACLY P53396
ACLY Q4LE36
ACLY A0A024R1Y2
ACOT8 O14734
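To see how uneven the mapping is, the table above can be summarised per gene. This is only a minimal base R sketch: the data frame is typed in by hand from the output shown, and the names ids_per_gene and collapsed are made up for illustration.

```r
# Mapping copied by hand from the bitr() output shown above
Protein_IDs <- data.frame(
  SYMBOL = c("AACS", "AACS", "ACAA2", "ACAA2",
             rep("ACADM", 6), rep("ACLY", 4), "ACOT8"),
  UNIPROT = c("Q86V21", "A0A024RBV2", "B3KNP8", "P42765",
              "A0A0S2Z366", "P11310", "B7Z9I1", "Q5HYG7", "Q5T4U5", "B4DJE7",
              "A0A024R1T9", "P53396", "Q4LE36", "A0A024R1Y2",
              "O14734"),
  stringsAsFactors = FALSE
)

# Number of UniProt accessions returned for each gene symbol
ids_per_gene <- table(Protein_IDs$SYMBOL)
print(ids_per_gene)

# One row per gene, with all of its accessions joined by ";"
collapsed <- aggregate(UNIPROT ~ SYMBOL, data = Protein_IDs,
                       FUN = paste, collapse = ";")
print(collapsed)
```

Collapsing like this keeps every accession, so it does not decide which one is the "right" protein for a gene; it only makes the one-to-many relationship explicit.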
This mapping seems to be very lenient: it appears to include unreviewed UniProt entries alongside the reviewed ones. You may have better luck using biomaRt.
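A rough sketch of what the biomaRt route could look like is below. It assumes the Bioconductor package biomaRt is installed, that the Ensembl server is reachable, and that the attribute names "hgnc_symbol" and "uniprotswissprot" are available in the human dataset (listAttributes(mart) shows what your Ensembl release offers). The helper name map_symbols_to_swissprot is made up for this example.

```r
# Sketch only: needs the Bioconductor package biomaRt and network access to Ensembl.
# map_symbols_to_swissprot() is a hypothetical helper, not part of any package.
map_symbols_to_swissprot <- function(symbols) {
  # Connect to the Ensembl genes mart for human
  mart <- biomaRt::useEnsembl(biomart = "genes",
                              dataset = "hsapiens_gene_ensembl")

  # Ask only for reviewed (Swiss-Prot) accessions, keyed by HGNC symbol
  res <- biomaRt::getBM(
    attributes = c("hgnc_symbol", "uniprotswissprot"),
    filters    = "hgnc_symbol",
    values     = symbols,
    mart       = mart
  )

  # Drop rows with no Swiss-Prot accession (empty string or NA)
  keep <- !is.na(res$uniprotswissprot) & nzchar(res$uniprotswissprot)
  res[keep, , drop = FALSE]
}

# Usage (not run here, since it queries the Ensembl server):
# Genes <- c("AACS", "ACAA2", "ACADM", "ACLY", "ACOT8")
# map_symbols_to_swissprot(Genes)
```

Restricting the query to Swiss-Prot accessions should give far fewer IDs per gene, although a gene can still map to more than one reviewed entry.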
The HUGO entry for ACADM indeed lists only one UniProt accession. You can download an official list of human gene symbols and their corresponding UniProt IDs from the HUGO site using a custom download: select the fields you want in the output.
Apologies. I actually wanted to report a solution I found using the "mygene" package. I hit the delete button in error.
Edit: see below for an update and a further query.
Please edit your answer and add some code so people facing similar problems will have a starting point for their solutions.