fgsea returns empty dataframe
1
1
Entering edit mode
3.4 years ago

Hi all,

I'm new to the community so apologise if this is a basic question or has been dealt with in another thread. I've looked through the fgsea documentation and am still having no luck.

I'm trying to run fgsea() on some nmf-decomposed gene expression data and keep getting an empty dataframe. This is confusing me because I conducted the analysis using MSigDB's GSEA software, and this seemed to work - I got rank-ordered gene lists and enrichment plots for each column of W_dash, so I'm sure it can be done in fgsea.

I ran nmf() on some gene expression data, then normalised the resulting W matrix. Now my task is to run GSEA on the normalised W matrix (W_dash), using the columns of W_dash as my ranked gene lists. W_dash looks like this:

                 W1           W2           W3           W4
AC000066_at      5.054266e-23 1.022037e-03 1.161830e-03 9.154324e-04
AC000099_at      2.620042e-03 1.369911e-03 1.495072e-02 5.842306e-03
AC000115_cds1_at 4.382477e-04 1.945546e-10 3.731523e-10 6.260872e-10
AC002045_xpt1_at 5.716331e-03 1.776428e-02 2.414694e-02 2.651467e-02
AC002073_cds1_at 3.328479e-03 4.652305e-04 4.209797e-03 4.125737e-03
AC002077_at      5.054266e-23 1.448357e-03 8.369223e-04 1.646869e-03
AC002086_at      1.109709e-03 1.874855e-03 3.724489e-04 2.646966e-03
AC002115_cds1_at 4.459879e-02 8.844502e-02 1.009927e-01 2.488297e-01
AC002115_cds3_at 2.441240e-02 1.509489e-02 2.094655e-02 1.398258e-02
AC002115_cds4_at 1.430019e-10 1.237761e-03 7.450935e-10 5.510702e-10
AC002115_rna2_at 8.419673e-03 1.567666e-02 7.554787e-03 3.770747e-02

I import a gene set from MSigDB:

>pathways = fgsea::gmtPathways("./c4.cgn.v7.4.symbols.gmt")

pathways looks like this:

>pathways[1]

$MORF_ATRX
  [1] "ADCY3"    "SEC31A"   "BTD"      "LTBP4"    "UTRN"     "FIG4"     "AQP5"     "TMEM11"   "HNRNPL"  
 [10] "TAF5L"    "INSIG2"   "AGPS"     "LSM1"     "PIK3CB"   "HTR4"     "UBR2"     "ZNF500"   "RFC5"    
 [19] "ST13"     "SSTR5"    "PCF11"    "CLP1"     "MSX1"     "CEP350"   "COLQ"     "HTR7"     "MC2R"    
 [28] "NEK9"     "PIAS2"    "AMFR"     "PDXDC1"   "ZNF133"   "MRE11"    "MTX1"     "CACNB2"   "TMCC1"   
 [37] "PRELID3A" "JRK"      "PPP1R3D"  "NTPCR"    "NKRF"     "MLN"      "KLHDC10"  "OARD1"    "FDXR"    
 [46] "PAXIP1"   "WDR62"    "RSRP1"    "IPCEF1"   "COX6A2"   "POP4"     "TCF12"    "RERE"     "LTN1"    
 [55] "ZNF292"   "LEPROTL1" "KAZN"     "TTLL5"    "KRT33A"   "GTSE1"    "SEC62"    "PARN"     "MIA2"    
 [64] "DNAJC16"  "SLC24A1"  "SYNJ2"    "SIK3"     "MC5R"     "FANCG"    "GOLGA3"   "MSL3"     "ZBTB22"  

Now I need to run fgsea() on the first column of W_dash as an ordered list:

>stats= sort(W_dash[,1], decreasing = T)
>stats[1:5]
M63379_at  hum_alu_at S40719_s_at V00594_s_at   Z70759_at 
1.0000000   0.9656315   0.9164475   0.6934949   0.6814499

When I run fgsea() I get an empty dataframe:

>fgsea <- fgsea(pathways=pathways, stats=stats)
>fgsea
Empty data.table (0 rows and 8 cols): pathway,pval,padj,log2err,ES,NES...

Since the GSEA software produces the expected output, this leads me to believe I'm missing something really simple, but I can't work it out. Any help/guidance would be greatly appreciated!

Many thanks,

charlielonergan

GSEA NMF R fgsea gene-expression • 2.6k views
ADD COMMENT
3
Entering edit mode
3.4 years ago
ATpoint 85k

The names of your ranked stats and the names of the entries of pathways must be the same. You have gene names in one but probe identifiers in the other. How should fgsea know the connection between both if the names are different?

ADD COMMENT
0
Entering edit mode

Thanks! Is there a way to link the gene names and probe identifiers together so fgsea knows the connections?

The assays used were Affymetrix Hu6800 if that's relevant.

ADD REPLY
1
Entering edit mode

There are many posts, please use search function/google, e.g. Converting Affymetrix Probes To Gene Ids

ADD REPLY
0
Entering edit mode

Thank you so much! Sorry if this was too basic but you've really saved my bacon here :)

ADD REPLY

Login before adding your answer.

Traffic: 1666 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6