clusterProfiler message "No gene can be mapped"
6.7 years ago
ARich ▴ 130

Hi Biostar users,

I am working with the clusterProfiler enrichKEGG function:

KEGG_all = enrichKEGG(regulated.gene$entrez, organism="human")

I used library(org.Hs.eg.db) to convert gene names (symbols) to Entrez IDs, which is the input this function accepts.

However, I am seeing a strange message and I am not sure what the reason could be:

--> No gene can be mapped....
--> Expected input gene ID: 7364,127,574537,5538,4351,221
--> return NULL...

Does this mean the IDs I am providing have no pathways associated? That would not be correct, because the Entrez IDs I am using do have pathways associated, except for a few that have none and are listed as NA.

names=c("57801","2152","54873","7412","148867","1435","90874","NA","2702","NA","1520,","3371","7185","26468","286336","NA","22829")
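
The vector above contains literal "NA" strings and one entry with a trailing comma ("1520,"), and neither can match a database ID. A minimal base R sketch for tidying such a vector before enrichment follows; the helper name clean_entrez_ids is hypothetical and not part of clusterProfiler, and it assumes valid Entrez IDs are purely numeric.

```r
# Hypothetical helper (not part of clusterProfiler): keep only unique,
# purely numeric Entrez IDs.
clean_entrez_ids <- function(ids) {
  ids <- trimws(as.character(ids))
  # Strip trailing punctuation or whitespace, e.g. "1520," becomes "1520".
  ids <- sub("[[:punct:][:space:]]+$", "", ids)
  # Drop real NA values, "NA" strings, and anything else that is not all digits.
  ids <- ids[!is.na(ids) & grepl("^[0-9]+$", ids)]
  unique(ids)
}

ids <- c("57801", "2152", "NA", "1520,", "3371", NA, "2152")
clean_entrez_ids(ids)
```

The cleaned vector can then be passed to enrichKEGG in place of the raw column.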

Because of this, I do not understand why this message is displayed.

Looking forward to some help.


Are you sure those Entrez IDs are for human genes? I checked a few and none seemed to be human proteins.


I am quite sure. The reason is that I can't even find these IDs in my file.

--> No gene can be mapped....
--> Expected input gene ID: 7364,127,574537,5538,4351,221
--> return NULL...

I used the standard mapIds conversion to get symbols and Entrez IDs:

res$symbol = mapIds(org.Hs.eg.db, keys=row.names(res), column="SYMBOL", keytype="ENSEMBL", multiVals="first")

res$entrez = mapIds(org.Hs.eg.db, keys=row.names(res), column="ENTREZID", keytype="ENSEMBL", multiVals="first")

res$name = mapIds(org.Hs.eg.db, keys=row.names(res), column="GENENAME", keytype="ENSEMBL", multiVals="first")

Which ones did you check? The ones listed in the "No gene can be mapped" message?

Thanks


https://www.ncbi.nlm.nih.gov/protein/7364
https://www.ncbi.nlm.nih.gov/protein/5538

Other IDs in the list above seem to be nucleotide IDs.


These genes are from the database, not from your input gene list.

> gene <- c("7364", "127", "574537", "5538", "4351", "221")
> bitr(gene, "ENTREZID", "SYMBOL", OrgDb = org.Hs.eg.db)
'select()' returned 1:1 mapping between keys and columns
  ENTREZID  SYMBOL
1     7364  UGT2B7
2      127    ADH4
3   574537  UGT2A2
4     5538    PPT1
5     4351     MPI
6      221 ALDH3B1

Hello,

I am also using the same enrichKEGG function. Can you please explain how the problem was resolved? I get this error:

> kegg_enrich <- enrichKEGG(gene = names(log_counts), organism = 'uma')
--> No gene can be mapped....
--> Expected input gene ID: UMAG_02115,UMAG_00118,UMAG_03692,UMAG_11744,UMAG_02508,UMAG_06105
--> return NULL

I'm also getting the same error when trying to use the enrichMKEGG function. The same list of Entrez IDs works for enrichKEGG, but not for enrichMKEGG. I've tried changing the minGSSize and maxGSSize limits, but still no solution.

> head(geneset_d1$ENTREZID)
[1] "12842"  "13032"  "20715"  "17329"  "12309"  "140474"


> oraKEGG1 <- enrichKEGG(gene = geneset_d1$ENTREZID, 
+                         organism = "mmu",
+                         pvalueCutoff = 0.05,
+                         qvalueCutoff = 0.25,
+                         pAdjustMethod = "BH",
+                         universe = NULL,
+                         minGSSize = 10,
+                         maxGSSize = 500)
> dim(oraKEGG1)
[1] 19  9



> ora_mKEGG1 <- enrichMKEGG(gene = geneset_d1$ENTREZID, 
+                         organism = "mmu",
+                         pvalueCutoff = 0.05,
+                         qvalueCutoff = 0.25,
+                         pAdjustMethod = "BH",
+                         universe = NULL,
+                         minGSSize = 1,
+                         maxGSSize = 2000)
--> No gene can be mapped....
--> Expected input gene ID: 14433,67834,14751,14380,230163,17448
--> return NULL...

Can anyone help with this?


That is because none of these genes are included in the KEGG Module database. You can check the gene list using bitr_kegg:

> gene <- c("12842", "13032", "20715", "17329", "12309", "140474")
> bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "mmu")
Reading KEGG annotation online:

trying URL 'http://rest.kegg.jp/link/mmu/module'
downloaded 29 KB

Reading KEGG annotation online:

trying URL 'http://rest.kegg.jp/list/module'
downloaded 24 KB

[1] kegg   Module
<0 rows> (or 0-length row.names)
Warning message:
In bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "mmu") :
  100% of input gene IDs are fail to map...

Thanks for your reply! I checked my gene lists, and for all of them 100% of the input genes failed to map. Would this be expected, or is there potentially a problem with the gene IDs that I have? For example, one gene list has 166 genes. How likely is it that none of these are in the KEGG Module database? Many thanks!


It could be that none of your 166 genes are in the KEGG Module database (a manually curated database). You can try KEGG Pathway or other databases.
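
To put a number on how many input genes are covered, one option is to compare the input against the table returned by bitr_kegg. The sketch below is base R only; the helper name mapped_fraction is hypothetical, and it assumes the table has the input IDs in a column named "kegg", as in the output shown earlier in this thread. The toy table and its module IDs are made up for illustration.

```r
# Hypothetical helper: fraction of unique input IDs found in a mapping table.
# `mapping` is assumed to look like bitr_kegg output, with the input IDs in
# a column named after fromType (here "kegg").
mapped_fraction <- function(ids, mapping, id_col = "kegg") {
  ids <- unique(as.character(ids))
  if (length(ids) == 0) return(NA_real_)
  mean(ids %in% as.character(mapping[[id_col]]))
}

# Toy mapping table; the module IDs are placeholders, not real KEGG modules.
toy <- data.frame(kegg = c("14433", "67834"), Module = c("M_toy1", "M_toy2"))
mapped_fraction(c("12842", "13032", "14433", "17329"), toy)
```

A fraction of zero for the Module table but a non-zero fraction for the Path table would suggest the IDs are fine and the genes are simply absent from the module annotation.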


I am having the same problem too... The code was previously working fine. Have you guys sorted it out?


Adding the parameter use_internal_data=TRUE may solve the problem.

KEGG analysis needs to read the KEGG annotation online, so if your network connection is bad, it may fail.


What versions of Bioconductor and clusterProfiler are you using? I had problems before updating to Bioconductor 3.16 and clusterProfiler 4.6.2; after that, things worked for me.

6.7 years ago
ARich ▴ 130

I got the answer to my problem. I was using a cutoff, and after applying it a few comparisons had no genes left in the down-regulated list. So this was more of a warning message than an error.
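
A small guard can make the empty-list case explicit instead of relying on the message. This is a base R sketch; enrich_if_any is a hypothetical wrapper, and count_genes is a stand-in for the real enrichment function so the example runs without clusterProfiler installed.

```r
# Hypothetical wrapper: skip the enrichment call when no usable IDs remain.
# `enrich_fun` is whatever enrichment function you use; its first argument
# is assumed to be the gene vector.
enrich_if_any <- function(ids, enrich_fun, ...) {
  ids <- unique(ids[!is.na(ids) & nzchar(ids)])
  if (length(ids) == 0) {
    message("Gene list is empty after filtering; skipping enrichment.")
    return(NULL)
  }
  enrich_fun(ids, ...)
}

# Stand-in for an enrichment function, for illustration only.
count_genes <- function(gene, ...) length(gene)

enrich_if_any(character(0), count_genes)      # prints the message, returns NULL
enrich_if_any(c("7364", "127"), count_genes)  # calls the function on 2 genes
```

In a real analysis the call would presumably look like enrich_if_any(down_genes, clusterProfiler::enrichKEGG, organism = "hsa"), where down_genes is a placeholder for your filtered list.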

4.7 years ago
wm ▴ 570

"No gene can be mapped...."

This happens because none of the input genes are included in the database (KEGG Pathway or KEGG Module).

We can check the genes before the enrichment analysis.

> bitr_kegg(gene, fromType = "kegg", toType = "Path", organism = "hsa")
> bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "hsa")

Source code for the function DOSE::enricher_internal():

## file: DOSE/R/enricher_internal.R
## lines 30-44


## query external ID to Term ID
gene <- as.character(unique(gene))
qExtID2TermID <- EXTID2TERMID(gene, USER_DATA)
qTermID <- unlist(qExtID2TermID)
if (is.null(qTermID)) {
    message("--> No gene can be mapped....")

    p2e <- get("PATHID2EXTID", envir=USER_DATA)
    sg <- unlist(p2e[1:10])
    sg <- sample(sg, min(length(sg), 6))
    message("--> Expected input gene ID: ", paste0(sg, collapse=','))

    message("--> return NULL...")
    return(NULL)
}
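
The snippet above accounts for both parts of the message: the warning appears when the lookup returns nothing, and the "Expected input gene ID" values are a random sample taken from the database, not from the input. The toy re-creation below mimics only the lookup step in base R. The pathway names and the lookup_terms helper are made up for illustration and are not the DOSE implementation.

```r
# Toy pathway -> gene annotation; pathway names are placeholders.
path2gene <- list(path_a = c("7364", "127"),
                  path_b = c("5538", "4351", "221"))

# Invert to gene -> pathways.
gene2path <- split(rep(names(path2gene), lengths(path2gene)),
                   unlist(path2gene))

# Look up the pathways for a set of query genes; NULL when nothing maps.
lookup_terms <- function(gene) {
  gene <- as.character(unique(gene))
  unlist(gene2path[gene[gene %in% names(gene2path)]])
}

is.null(lookup_terms(c("57801", "NA")))   # nothing maps: the message case
is.null(lookup_terms(c("127", "57801")))  # one gene maps: enrichment proceeds
```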
6.0 years ago

Hi sbbinfo,

I found that the help page for enrichKEGG is misleading. For organisms other than "hsa", the "gene" parameter is not always an Entrez gene ID. For "uma", you should pass a vector of IDs in the "UMAG_02115, UMAG_00118" format, not a vector of Entrez gene IDs.

Best wishes.


I know this may be late, but I am trying to do the analysis for zebrafish. As you mentioned, the enrichKEGG function does not work when I use Entrez gene IDs. Where did you find this format? I would like to know the input format for zebrafish. I just tried "dre_102725537,dre_795613,dre_393541,dre_100000710,dre_325037,dre_558156" and "DRE_102725537,DRE_795613,DRE_393541,DRE_100000710,DRE_325037,DRE_558156". Neither worked. Can someone help me?
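
One thing that may be worth trying is to pass the bare IDs, without an organism prefix, and to tell enrichKEGG what kind of ID they are through its keyType argument ("kegg", "ncbi-geneid", "ncbi-proteinid" or "uniprot"). The sketch below is untested against zebrafish data. It assumes the numbers in the post above are NCBI gene IDs, and it needs clusterProfiler plus network access to KEGG, so the call is guarded.

```r
# IDs taken from the post above with the "dre_" prefix removed; treating
# them as NCBI gene IDs is an assumption.
entrez_ids <- c("795613", "393541", "325037")

if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- tryCatch(
    clusterProfiler::enrichKEGG(entrez_ids,
                                organism = "dre",
                                keyType = "ncbi-geneid"),
    # Network or KEGG access problems end up here instead of stopping the script.
    error = function(e) {
      message("KEGG query failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

If the message still appears, the "Expected input gene ID" line shows what the IDs for that organism look like in the database.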

Thank you!


I'm having the same issue here.. The weirdest thing is that my code ran perfectly okay two weeks ago - but now it just does not work anymore.. I wonder if there was an update for the package or anything...


That's so weird. I got the same issue too and don't know how to solve it, maybe I should change the database...


I am having the same issue too. I wonder if you have figured it out?


I am facing the same problem. It used to work a little while ago for running KEGG enrichment on human Entrez IDs. Now, it does not recognize the input Entrez IDs.

kk <- enrichKEGG(gene = gene, 
                       organism = "hsa", 
                       keyType = "kegg", 
                       pvalueCutoff = 0.05, 
                       pAdjustMethod = "BH", 
                       universe = annotatedData$gene_id, 
                       minGSSize = 10, 
                       maxGSSize = 500, 
                       qvalueCutoff = 0.2, 
                       use_internal_data = FALSE )

The error message:

--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL... 

clusterProfiler v4.2.2

R v4.1.2

I also tried running it on all genes (gene = annotatedData$gene_id). Still, no IDs could be mapped.

These functions work as expected, though, using the same Entrez IDs as input:

enrichMKEGG(gene = gene,
            organism = 'hsa',
            pvalueCutoff = 1,
            qvalueCutoff = 1)

enrichWP(gene,
         organism = "Homo sapiens")

Does anyone have any hints?


Since this relies on KEGG, perhaps KEGG has stopped providing access to this package.

Would you mind letting the author know by using "contact by email" link on the project page: https://guangchuangyu.github.io/software/clusterProfiler/


You were right. The author kindly replied and I also found this: https://github.com/YuLab-SMU/clusterProfiler/issues/561

The latest github version of clusterProfiler is supposed to work.

remotes::install_github("YuLab-SMU/clusterProfiler") 

version 4.7.1.3

The DOSE package has to be updated, too.

This worked for me a couple of days ago but it stopped working again. So this issue may not be completely resolved.

19 months ago
coggy • 0

Just FYI, I was also having the same trouble with KEGG pathway analysis in clusterProfiler. I updated BiocManager and clusterProfiler (4.2.2 -> 4.6.2), and then it worked. It seems that the previous version of clusterProfiler no longer works.
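
A quick way to see whether an installation predates the version reported to work here is to compare version strings with base R. The helper name is_outdated is hypothetical, and the 4.6.2 threshold is just the version mentioned in this thread, not an official minimum.

```r
# Hypothetical helper: TRUE when `installed` is older than `minimum`.
is_outdated <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

is_outdated("4.2.2", "4.6.2")    # the version that failed in this thread
is_outdated("4.7.1.3", "4.6.2")  # the GitHub version mentioned above

# With the package installed, check the real version.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  is_outdated(as.character(utils::packageVersion("clusterProfiler")), "4.6.2")
}
```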

18 months ago

use_internal_data=TRUE

The KEGG analysis needs to read the KEGG annotation online, so if your network connection is bad, the analysis may fail.

I ran enrichKEGG without 'use_internal_data=TRUE' on Windows and it worked, while it failed on a Linux cluster. When I added 'use_internal_data=TRUE' there, it worked.
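
To tell a network problem apart from an ID problem, it may help to test whether the KEGG REST endpoint is reachable from the machine in question. The sketch below is base R; kegg_reachable is a hypothetical helper, and the default URL is the one that appears in the bitr_kegg output earlier in this thread, which may have changed since.

```r
# Hypothetical helper: TRUE if at least one line can be read from the URL.
kegg_reachable <- function(endpoint = "http://rest.kegg.jp/list/module",
                           timeout = 10) {
  old <- options(timeout = timeout)   # limit how long the attempt may take
  on.exit(options(old), add = TRUE)   # restore the previous setting
  first_line <- tryCatch(
    suppressWarnings(readLines(endpoint, n = 1L)),
    error = function(e) character(0)  # any connection failure counts as unreachable
  )
  length(first_line) > 0
}

kegg_reachable()
```

If this returns FALSE on a cluster node but TRUE on a desktop, the node is probably blocked from reaching KEGG, which would match the behaviour described above.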
