working with .gmt files
9.0 years ago
D H ▴ 30

Hi!

I have downloaded a pathway data set in .gmt format from the GSEA website.

I'm wondering how I can properly read this data set into R.

Could anyone help me?

Thank you!

gsea R • 20k views

This post is related to R.


Thank you Zhilong!

Following the installation procedure (https://github.com/zhilongjia/cogena), I tried to install devtools. However, I have Rstudio 2.15.2 and devtools is not available for it.

Any recommendations?


You can install cogena with the following code:

source("https://bioconductor.org/biocLite.R")
biocLite("cogena")

Also, the package you choose should suit your research goal (I assume you want to do more than just read a .gmt file). Deepak has listed two other options as well.


I tried the URL, but I get the following error:

Error in file(filename, "r", encoding = encoding) :
  cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) : unsupported URL scheme

By the way, I use a Mac, if that makes any difference.
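
A note on the error above: "unsupported URL scheme" is most likely caused by the R version rather than by the Mac. Older R releases such as 2.15.x could not open https:// URLs from source(), so the biocLite.R script never loads. On current R versions (3.5.0 or later), Bioconductor packages are installed through the BiocManager package instead. A minimal sketch, assuming a current R installation; the wrapper function install_cogena is a hypothetical name used only for illustration:

```r
# Hypothetical wrapper (not part of any package): installs cogena from
# Bioconductor via BiocManager. Assumes R >= 3.5.0 and internet access.
install_cogena <- function() {
  # BiocManager itself lives on CRAN
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("cogena")
}

# install_cogena()  # uncomment to actually run the installation
```

This will not work on R 2.15.x itself, so upgrading R is probably the practical first step.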

9.0 years ago
Deepak Tanwar ★ 4.2k

You can read it with the read.gmt function from the qusage package, or with the GSA.read.gmt function from the GSA package.
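
If you would rather not install a package, a .gmt file is simple enough to parse by hand: each line is tab-separated, with the gene set name first, a description second, and the genes after that. The following is a minimal base R sketch under that assumption; read_gmt_base is a hypothetical helper name and the toy file contents are made up for illustration:

```r
# Hypothetical helper: parse a GMT file into a named list of gene vectors.
# Assumes the usual layout: name <TAB> description <TAB> gene1 <TAB> gene2 ...
read_gmt_base <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]                  # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  gene_sets <- lapply(fields, function(x) {
    genes <- x[-(1:2)]                           # skip name and description
    genes[nzchar(genes)]                         # drop empty trailing fields
  })
  names(gene_sets) <- vapply(fields, `[`, character(1), 1)
  gene_sets
}

# Toy GMT file (made-up gene sets) so the example runs on its own
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c("SET_A\thttp://example.org/a\tTP53\tBRCA1\tEGFR",
             "SET_B\tna\tMYC\tKRAS"), gmt_file)

gene_sets <- read_gmt_base(gmt_file)
str(gene_sets)
```

With qusage installed, read.gmt(gmt_file) should give a similar named list. GSA.read.gmt instead returns a list with separate genesets, geneset.names and geneset.descriptions elements.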

3.3 years ago
Ahmed Alhendi ▴ 240

You can try the msigdbr package: https://cran.r-project.org/web/packages/msigdbr/vignettes/msigdbr-intro.html

It provides the MSigDB gene sets in a format that is compatible with fgsea and clusterProfiler. For example, I use it to run fgsea with the human hallmark gene sets:

library(msigdbr)
library(fgsea)

# Retrieve the human H (hallmark) gene sets
msigdbr_df <- msigdbr(species = "human", category = "H")


head(msigdbr_df)
# A tibble: 6 x 15
  gs_cat gs_subcat gs_name gene_symbol entrez_gene ensembl_gene human_gene_symb…
  <chr>  <chr>     <chr>   <chr>             <int> <chr>        <chr>           
1 H      ""        HALLMA… ABCA1                19 ENSG0000016… ABCA1           
2 H      ""        HALLMA… ABCB8             11194 ENSG0000019… ABCB8           
3 H      ""        HALLMA… ACAA2             10449 ENSG0000016… ACAA2           
4 H      ""        HALLMA… ACADL                33 ENSG0000011… ACADL           
5 H      ""        HALLMA… ACADM                34 ENSG0000011… ACADM           
6 H      ""        HALLMA… ACADS                35 ENSG0000012… ACADS      

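# Convert to the named-list format that fgsea expects
pathwaysH <- split(x = msigdbr_df$entrez_gene, f = msigdbr_df$gs_name)

# Run fgsea enrichment; `ranks` is your own named numeric vector of
# gene-level statistics (names = Entrez IDs); add further arguments as needed
fgseaRes <- fgsea(pathways = pathwaysH, stats = ranks)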
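
The ranks object is not defined in the snippet above, so here is a rough sketch of the two inputs fgsea needs, using base R only. The toy data frame stands in for msigdbr_df, and all set names, IDs and statistics are made up for illustration:

```r
# Toy stand-in for msigdbr_df (made-up set names and IDs, not real MSigDB data)
toy_df <- data.frame(
  gs_name     = c("SET_A", "SET_A", "SET_A", "SET_B", "SET_B"),
  entrez_gene = c(19L, 33L, 34L, 35L, 19L),
  stringsAsFactors = FALSE
)

# Same split() step as above: one vector of gene IDs per gene set
pathways <- split(x = toy_df$entrez_gene, f = toy_df$gs_name)

# Hypothetical gene-level statistics (e.g. one t-statistic per gene),
# named by the same ID type that is used in the pathways
ranks <- c("19" = 2.1, "33" = -0.4, "34" = 1.3, "35" = -1.8, "99" = 0.2)
ranks <- sort(ranks, decreasing = TRUE)

# Sanity check: how many genes of each set appear in the ranking
overlap <- vapply(
  pathways,
  function(ids) sum(as.character(ids) %in% names(ranks)),
  integer(1)
)
print(overlap)
```

If the overlap is zero for most sets, the ID types probably do not match (for example gene symbols in ranks but Entrez IDs in the pathways).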
9.0 years ago
Zhilong Jia ★ 2.2k
  1. GSEABase package. http://svitsrv25.epfl.ch/R-doc/library/GSEABase/html/getObjects.html
  2. cogena::gmt2list()
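
For the first option, a short sketch of how this might look, assuming GSEABase is installed from Bioconductor. getGmt reads the file into a GeneSetCollection, and geneIds turns that into a named list. The toy file is made up for illustration, and the GSEABase calls are skipped if the package is missing:

```r
# Toy GMT file (made-up gene sets) so the example runs on its own
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c("SET_A\thttp://example.org/a\tTP53\tBRCA1\tEGFR",
             "SET_B\tna\tMYC\tKRAS"), gmt_file)

# Only runs when GSEABase is available
if (requireNamespace("GSEABase", quietly = TRUE)) {
  gsc <- GSEABase::getGmt(gmt_file)        # GeneSetCollection object
  gene_sets <- GSEABase::geneIds(gsc)      # named list: set name -> gene IDs
  print(names(gene_sets))
}
```

For the second option, cogena::gmt2list() takes the path to the .gmt file and should return a plain list directly.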