Appropriate gene IDs for enrichment analysis (with clusterProfiler)
5.4 years ago
ayatrience • 0

Hi, I am trying to do GO and KEGG enrichment analysis using the R package clusterProfiler. I converted gene IDs (ENSEMBL → UniProt) with the "bitr" function for the KEGG enrichment analysis. However, "bitr" sometimes returns multiple IDs for a single gene (the same happened when I converted to ENTREZ IDs). I thought I should pick one ID per gene for the enrichment analysis. So my question is: how do people select the appropriate ID when multiple IDs are returned? Do I need to do it manually by checking each returned ID on the UniProt website (e.g. judging by the annotation score)? That is a lot of work. How does everyone deal with this problem? (Or do we not need to pick one ID per gene in the first place?)

rna-seq R gene gene IDs clusterProfiler • 7.0k views

Picking one ID per gene seems like a good approach. If I want to take the UniProt annotation score into account, I could also write a separate script using the "multiVals" argument. I will follow your script and workflow. Thank you very much!

5.4 years ago
Barry Digby ★ 1.3k

Follow the detailed workflow here: https://github.com/twbattaglia/RNAseq-workflow

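library(AnnotationDbi)
library(org.Mm.eg.db)

# Add ENTREZ ID; keytype must match the row names of `results`
# (gene symbols in the workflow; use "ENSEMBL" for Ensembl gene IDs)
results$entrez <- mapIds(x = org.Mm.eg.db,
                         keys = row.names(results),
                         column = "ENTREZID",
                         keytype = "SYMBOL",
                         multiVals = "first")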

For starters, don't bother using ENSEMBL to UniProt. In the guide, the user has set

multiVals = "first"

The documentation describes it as: "This value means that when there are multiple matches only the 1st thing that comes back will be returned. This is the default behavior." I have seen this used in quite a lot of workflows, so I assume it is OK. If you want to set it to something else, check out the multiVals argument here: https://www.rdocumentation.org/packages/AnnotationDbi/versions/1.30.1/topics/AnnotationDb-objects
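To make the options concrete, here is a minimal base R sketch of what "keep the first match", "drop ambiguous genes" and "keep everything" look like on a mapping table. It assumes a data frame shaped like the one "bitr" returns (one row per ENSEMBL/ENTREZID pair); the IDs are made up, and the code only imitates the behaviour of the multiVals options rather than calling AnnotationDbi itself.

```r
# Toy mapping shaped like the data frame returned by bitr().
# All IDs below are invented for illustration.
map <- data.frame(
  ENSEMBL  = c("ENSMUSG_A", "ENSMUSG_A", "ENSMUSG_B", "ENSMUSG_C", "ENSMUSG_C"),
  ENTREZID = c("101", "102", "201", "301", "302"),
  stringsAsFactors = FALSE
)

# "first"-style: keep the first ENTREZ ID listed for each Ensembl gene.
first_map <- map[!duplicated(map$ENSEMBL), ]

# "filter"-style: drop every Ensembl gene that maps to more than one ENTREZ ID.
n_hits <- table(map$ENSEMBL)
unique_map <- map[map$ENSEMBL %in% names(n_hits)[n_hits == 1], ]

# "list"-style: keep all mappings, grouped per gene, for manual inspection.
all_map <- split(map$ENTREZID, map$ENSEMBL)

print(first_map)
print(unique_map)
```

Which variant is appropriate depends on the analysis; keeping the first match is the simplest, while the grouped list is handy if you want to inspect the ambiguous genes by hand.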

(EDIT): Once you have a handle on that workflow, move on to this one: https://yulab-smu.github.io/clusterProfiler-book/chapter12.html


Hi, thank you for the answer! Do you by any chance know how to get the Ensembl annotation version from org.Mm.eg.db? This could be an issue because my upstream analysis uses a newer genome build (mm39). The snapshot date doesn't tell me which Ensembl annotation version it contains.


Sorry, just realized that there's an answer for this in this post.
