Functional profiling visualization enrichplot error: NAs introduced by coercion
3.1 years ago
Lepomis_8 ▴ 30

Hello,

I'm having trouble generating a dotplot of functionally enriched pathways across different groups. Most of them are mapped as NA, with this warning:

Warning message: In order(as.numeric(unique(result$Cluster))) : NAs introduced by coercion

I'm trying to figure out why this happens. My code is adapted from the original code in https://f1000research.com/articles/9-709. However, even when I follow their code exactly, I get NAs, so I wonder whether anyone else can run it without NAs. I appreciate any help I can get!

Here is my code below:

    ## gprofiler

    # load the package
    library(gprofiler2)

    # installing additional packages
    # if (!requireNamespace("BiocManager", quietly = TRUE))
    #     install.packages("BiocManager")
    # BiocManager::install(c("clusterProfiler", "enrichplot", "DOSE"))

    # loading the additional packages
    library(clusterProfiler)
    library(enrichplot)
    library(DOSE) # needed to convert to enrichResult object
    library(airway)
    library(DESeq2)

    # load the airway data
    data(airway)
    # construct the DESeqDataSet object
    ddsMat = DESeqDataSetFromMatrix(countData = assay(airway),
                                    colData = colData(airway),
                                    design = ~ cell + dex)
    # run DESeq2 pipeline
    dds = DESeq(ddsMat)
    # get the results
    results = results(dds, contrast = c("dex", "trt", "untrt"),
                      alpha = 0.05, lfcThreshold = 1)
    # keep only the significant genes
    results_sig = subset(results, padj < 0.05)
    # get the significant up-regulated genes
    up = subset(results_sig, log2FoldChange > 0)
    # get the significant down-regulated genes
    down = subset(results_sig, log2FoldChange < 0)

    # convert the Ensembl IDs to gene names
    up_names = gconvert(row.names(up))
    down_names = gconvert(row.names(down))

    # enrichment analysis using gene names
    multi_gp = gost(list("up-regulated" = up_names$name,
                         "down-regulated" = down_names$name),
                    multi_query = FALSE, evcodes = TRUE)

    # modify the g:Profiler data frame
    gp_mod = multi_gp$result[,c("query", "source", "term_id",
                                "term_name", "p_value", "query_size",
                                "intersection_size", "term_size",
                                "effective_domain_size", "intersection")]

    gp_mod$GeneRatio = paste0(gp_mod$intersection_size, "/", gp_mod$query_size)
    gp_mod$BgRatio = paste0(gp_mod$term_size, "/", gp_mod$effective_domain_size)

    # rename the columns to the names clusterProfiler/enrichplot expect
    names(gp_mod) = c("Cluster", "Category", "ID", "Description", "p.adjust",
                      "query_size", "Count", "term_size", "effective_domain_size",
                      "geneID", "GeneRatio", "BgRatio")

    gp_mod$geneID = gsub(",", "/", gp_mod$geneID)
    row.names(gp_mod) = gp_mod$ID

    # define as compareClusterResult object
    gp_mod_cluster = new("compareClusterResult", compareClusterResult = gp_mod)

    # define as enrichResult object
    gp_mod_enrich = new("enrichResult", result = gp_mod)

    enrichplot::dotplot(gp_mod_cluster)
cluster profiler enrichplot R • 2.5k views

Can't say without seeing your gene list, but it's likely some of your gene/cluster names can't be converted to numeric using as.numeric(), resulting in NAs. See here: https://statisticsglobe.com/warning-message-nas-introduced-by-coercion-in-r
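
To illustrate the coercion described above, here is a minimal base-R sketch. The vector `labels` is a made-up example, not data from the question; it only shows that `as.numeric()` returns NA for any string that is not a number.

```r
# Hypothetical labels: two look like numbers, one does not.
labels <- c("1", "2", "up-regulated")

# as.numeric() warns "NAs introduced by coercion" here;
# suppressWarnings() is used only to keep the demo quiet.
converted <- suppressWarnings(as.numeric(labels))

print(converted)         # [1]  1  2 NA
print(is.na(converted))  # [1] FALSE FALSE  TRUE
```

Running `as.numeric(labels)` without `suppressWarnings()` prints the same warning text as in the question.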


Thanks. I noticed the formatting of my code was messed up, so I fixed it. The gene list comes from the airway package (library(airway)). Note that my code already contains the line gp_mod$geneID = gsub(",", "/", gp_mod$geneID), which is the kind of fix the article you linked suggests, and it still didn't remove the NAs! I tried changing it to gp_mod$geneID = gsub(",", "", gp_mod$geneID) and still saw no change.


The problem occurs because gp_mod_cluster is not properly formatted for enrichplot::dotplot. I'm not sure whether changes to gprofiler2 or enrichplot::dotplot since the publication of the paper you linked have caused this problem.

dotplot internally calls ggplot2::fortify, which, as far as I can tell, is intended to convert gp_mod_cluster into a tidier data frame for plotting. The warning ("Warning message: In order(as.numeric(unique(result$Cluster))) : NAs introduced by coercion") occurs because fortify() tries to evaluate order(as.numeric(unique(result$Cluster))). This doesn't work because the values in $Cluster are character labels that can't be converted to numeric:

    > gp_mod_cluster@compareClusterResult$Cluster
     [1] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "down-regulated"
     [6] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "down-regulated"
    [11] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "down-regulated"
    [16] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "down-regulated"
    [21] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "down-regulated"
    [26] "down-regulated" "down-regulated" "down-regulated" "down-regulated" "up-regulated"
    [31] "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"
    [36] "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"
    [41] "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"
    [46] "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"
    [51] "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"   "up-regulated"
    [56] "up-regulated"   "up-regulated"   "up-regulated"
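
The expression from the warning can be reproduced in base R without any of the packages. The sketch below rebuilds a `cluster` vector with the same labels as the output above (29 of each) and then shows one possible workaround, converting the labels to a factor. The factor conversion is my assumption about what might help, not something the paper or the enrichplot documentation is confirmed to recommend.

```r
# Stand-in for gp_mod_cluster@compareClusterResult$Cluster
cluster <- c(rep("down-regulated", 29), rep("up-regulated", 29))

# The expression from the warning: both unique labels become NA.
as_num <- suppressWarnings(as.numeric(unique(cluster)))
print(as_num)  # [1] NA NA

# A factor stores integer codes, so as.numeric() returns the codes
# and no NA is produced. The level order here is an arbitrary choice.
cluster_f <- factor(cluster, levels = c("up-regulated", "down-regulated"))
codes <- as.numeric(unique(cluster_f))
print(codes)         # [1] 2 1
print(order(codes))  # [1] 2 1
```

In the code from the question this would correspond to something like `gp_mod$Cluster = factor(gp_mod$Cluster)` before the call to `new("compareClusterResult", ...)`. I have not verified that this removes the warning in `enrichplot::dotplot`; that may depend on the installed enrichplot version.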

Okay, thank you! I will try to get around this by just creating a dotplot manually.
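
For anyone taking the same route, a manual dotplot can be sketched with ggplot2 directly. The data frame `toy` below is a made-up stand-in that reuses the column names of gp_mod (Cluster, Description, GeneRatio, p.adjust, Count); none of the values come from the airway analysis, and the aesthetics are just one reasonable choice, not a copy of what enrichplot draws.

```r
library(ggplot2)

# Toy stand-in for gp_mod; all values are invented for illustration.
toy <- data.frame(
  Cluster     = c("up-regulated", "up-regulated", "down-regulated"),
  Description = c("term A", "term B", "term A"),
  GeneRatio   = c("5/100", "12/100", "8/80"),
  p.adjust    = c(0.01, 0.001, 0.03),
  Count       = c(5, 12, 8),
  stringsAsFactors = FALSE
)

# GeneRatio is stored as a "k/n" string, so convert it to a number.
parts <- strsplit(toy$GeneRatio, "/", fixed = TRUE)
toy$ratio <- vapply(parts,
                    function(p) as.numeric(p[1]) / as.numeric(p[2]),
                    numeric(1))

# One point per (cluster, term): size shows the ratio, colour the p-value.
p <- ggplot(toy, aes(x = Cluster, y = Description,
                     size = ratio, colour = p.adjust)) +
  geom_point() +
  scale_colour_gradient(low = "red", high = "blue") +
  labs(x = NULL, y = NULL, size = "GeneRatio", colour = "p.adjust") +
  theme_bw()
print(p)
```

With the real data, `toy` would be replaced by gp_mod, probably after filtering to a manageable number of terms per cluster.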
