clusterProfiler
17 months ago
CaroH ▴ 10

Dear all,

I am currently performing GSEA in R on bulk RNA-sequencing datasets. I am using the clusterProfiler package with the gseGO and gseKEGG functions. I have converted the gene names from SYMBOL to ENTREZID and sorted the list by log2FC.

I have quite a few KEGG pathways coming up in the analysis using gseKEGG. I wanted to test the gseMKEGG function. However, when using the same list, no term is enriched under the specified pvalueCutoff. I get the following message:

"preparing geneSet collections...
GSEA analysis...
no term enriched under specific pvalueCutoff...
Warning messages:
1: In preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam,  :
  There are ties in the preranked stats (13.54% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.
2: In serialize(data, node$con) :
  'package:stats' may not be available when loading"

I used the following code for gseMKEGG:

mmkk2 <- gseMKEGG(geneList=gene_list, 
                  organism = "mmu", 
                  pvalueCutoff = 0.05)

Shouldn't the results of gseMKEGG be similar to gseKEGG?

Thanks in advance for your help!

17 months ago
LauferVA 4.5k

You may want to look further into this warning:

1: In preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (13.54% of the list). The order of those tied genes will be arbitrary, which may produce unexpected results.

This issue has been discussed before on Biostars, and also on GitHub (e.g. https://github.com/YuLab-SMU/clusterProfiler/issues/214). All tied genes should theoretically share the same rank, but in practice their relative order is arbitrary. I don't know how many genes are in your dataset, but suppose it's 25,000. Then 13.54% corresponds to roughly 3,385 genes with tied ranking statistics (log2FC values in your case), and therefore roughly 3,385 genes with ambiguous ranks.
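To see how many genes are affected and which values they share, a rough base R check along these lines may help. It is only a sketch: the toy `gene_list` below (made-up ENTREZ-style IDs and log2FC values) stands in for the real ranked vector, and the fraction it reports is not guaranteed to match the percentage in the warning exactly, since the package may count ties differently.

```r
# Toy ranked list standing in for the real `gene_list`
# (names = ENTREZ-style IDs, values = log2FC). All values are invented.
gene_list <- c("11111" = 2.5, "22222" = 1.2, "33333" = 1.2,
               "44444" = 0.0, "55555" = -0.7, "66666" = -0.7,
               "77777" = -0.7, "88888" = -3.1)

# Flag every gene whose statistic is shared with at least one other gene.
is_tied <- duplicated(gene_list) | duplicated(gene_list, fromLast = TRUE)

# Fraction of the list involved in a tie.
tied_fraction <- mean(is_tied)

# Tied values ordered by how often they occur; one dominant value
# can hint at a common cause for the ties.
tied_values <- sort(table(gene_list[is_tied]), decreasing = TRUE)

print(tied_fraction)
print(head(tied_values))
```

If most ties collapse onto a handful of values, that points to a shared cause (for example, many genes with identical, near-empty count profiles) rather than coincidence.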

It could be worth looking into why you have so many tied genes (e.g., did you retain genes with very low counts?).
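If low-count genes turn out to be the cause, one option is to filter them out before the differential expression step. The following is a minimal base R sketch under stated assumptions: `counts` is a hypothetical raw count matrix (genes in rows, samples in columns), and the thresholds `min_count` and `min_samples` are arbitrary illustrative choices, not recommended values.

```r
# Hypothetical raw count matrix: rows are genes, columns are samples.
counts <- matrix(
  c(  0,   1,   0,   2,
     50,  60,  45,  70,
      3,   0,   1,   0,
    200, 180, 220, 210),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("s1", "s2", "s3", "s4"))
)

min_count   <- 10  # assumed minimum count per sample
min_samples <- 2   # assumed number of samples that must reach min_count

# Keep genes that reach min_count in at least min_samples samples.
keep <- rowSums(counts >= min_count) >= min_samples
filtered <- counts[keep, , drop = FALSE]

print(rownames(filtered))
```

After filtering, the differential expression analysis and the ranked list would need to be regenerated, and the tie percentage in the warning can be compared against the earlier run.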

It is possible that the difference between the two results has another cause, but this is a pretty good place to start.
