How to create a GSEA enrichment plot for the phenotypes in R?
4.6 years ago
newbie ▴ 130

Dear all,

I came across the paper "Recurrently deregulated lncRNAs in hepatocellular carcinoma", in which the authors ran GSEA on lncRNA expression data. See Figures 3a and 3b; the figure looks like this:

[Image: GSEA enrichment plots, Figures 3a and 3b of the paper]

In the Methods section I see the following information:

We used GSEA (v2.0.13) to assess enrichment of sets of recurrently deregulated lncRNAs in other data sets. GSEA requires three input files: a gene set, expression data and phenotype labels. We used the recurrently deregulated lncRNA set (the tumorigenesis set or metastasis set) as the gene set. Expression data were derived from the TCGA cohort or published liver cancer data. For the published data, we used sample information (adjacent normal tissue, primary tumour and PVTT) as phenotype labels. Because the TCGA LIHC cohort had no PVTT or other metastasis samples, we classified primary tumour samples into invasion and non-invasion groups based on clinical information (T stages in the TNM staging system: T1 versus T2–T4). The lncRNAs were rank-ordered by differential expression (signal2Noise in GSEA v2.0.13) between the two groups.

I'd like to make a similar figure with my own data in R. Can anyone tell me how to do that? Is there an R package for this? An example would be appreciated. Thank you.

RNA-Seq r gsea lncrna • 16k views
Try this answer for plotting GSEA output in R

This answer mentions limma's barcodeplot. With limma you can perform the whole analysis in R, including the plot. (Note that limma's camera gene set analysis is similar, but not equivalent, to GSEA.)

Thanks, but for the first answer I guess I would have to run the javaGSEA desktop version first and then use its output for plotting. And the second one, limma's barcode plot, doesn't look like the figure I posted in my question.

Is there any other way you are aware of?

Which kind of data do you have?

I have differentially expressed lncRNAs between Breast and Normal samples, from edgeR output. Instead of a heatmap, I want to create an enrichment plot like the one above.

Any help please @ATpoint

You do not need to add multiple comments. If I or anyone else is online and feels like responding, we will. The plots above are the output of gene set enrichment analysis, so you have to perform GSEA. One R package for this is fgsea from Bioconductor, which also has a plotting function that creates plots somewhat like these.
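For reference, the curve in such a plot is a running-sum statistic over the ranked gene list. Below is a minimal, package-independent sketch of that computation in Python (the same arithmetic ports directly to R). The function name, the toy lncRNA identifiers and the scores are all made up for illustration, and this is not the code of fgsea or of the GSEA desktop tool.

```python
def running_enrichment(ranked, gene_set, weight=1.0):
    """Running-sum enrichment curve for one gene set.

    ranked   -- list of (gene, score) pairs, sorted by score in descending order
    gene_set -- set of gene identifiers to test
    weight   -- exponent applied to |score| for genes in the set (1.0 = weighted)
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    hit_weights = [abs(score) ** weight if hit else 0.0
                   for (_, score), hit in zip(ranked, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running, curve = 0.0, []
    for w, hit in zip(hit_weights, in_set):
        # Step up at a gene in the set, step down at any other gene.
        running += w / total_hit if hit else -1.0 / n_miss
        curve.append(running)

    # The enrichment score is the largest deviation of the curve from zero.
    es = max(curve, key=abs)
    return curve, es


# Toy ranked list (hypothetical identifiers and scores).
ranked = [("lnc1", 3.0), ("lnc2", 2.0), ("lnc3", 1.0), ("lnc4", -1.0), ("lnc5", -2.0)]
curve, es = running_enrichment(ranked, {"lnc1", "lnc2"})
```

Plotting `curve` against rank position gives the green line of an enrichment plot, and the positions of the set members give the tick marks underneath it. As far as I know, fgsea's `plotEnrichment()` draws this kind of curve from a gene set and a named vector of ranking statistics.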

4.6 years ago
Barry Digby ★ 1.3k

Look at Stephen Turner's example using fgsea here: https://stephenturner.github.io/deseq-to-fgsea/#using_the_fgsea_package

Plotting the Enrichment plot: Multiple plots outputs using for loop in R

You might want to use a different metric to calculate the 'rank' of your genes, ST uses DESeq2's 'stat' column.
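As one example of an alternative metric, the paper quoted in the question ranks by signal-to-noise. A rough sketch of that ranking in Python follows (easy to translate to R); it assumes the plain formula, difference of group means divided by the sum of group standard deviations, with the sample standard deviation. The GSEA desktop tool additionally adjusts very small standard deviations, which this sketch does not reproduce. Gene names, labels and values are invented for illustration.

```python
from statistics import mean, stdev


def signal_to_noise(group_a, group_b):
    """(mean_a - mean_b) / (sd_a + sd_b); each group needs at least 2 samples."""
    noise = stdev(group_a) + stdev(group_b)
    if noise == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / noise


def rank_genes(expression, labels, group_a, group_b):
    """Return (gene, score) pairs sorted from most up in group_a to most down."""
    ranked = []
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == group_a]
        b = [v for v, lab in zip(values, labels) if lab == group_b]
        ranked.append((gene, signal_to_noise(a, b)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical expression values: three tumour samples, then three normal samples.
labels = ["tumour", "tumour", "tumour", "normal", "normal", "normal"]
expression = {
    "lncA": [5, 6, 7, 1, 2, 3],
    "lncB": [1, 2, 3, 5, 6, 7],
    "lncC": [2, 4, 6, 3, 4, 5],
}
ranked = rank_genes(expression, labels, "tumour", "normal")
```

The resulting ranked list is what gets tested against a gene set; any other per-gene statistic (for example a test statistic from your differential expression tool) can be substituted for the score.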

I don't want a pathway enrichment plot. I'm looking for an enrichment plot showing how a set of genes associates with the condition. Please check the plot I posted in my question.

I see. It seems to me that they identified genes related to metastasis/tumorigenesis in the paper. Perhaps, using domain knowledge, you could combine multiple .gmt files related to metastasis/tumorigenesis to get a list of genes for each condition, or alternatively generate your own custom .gmt file from scratch. Regardless, you will need a gene set to test your ranked gene list against. Best of luck!
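In case it helps, a .gmt file is plain tab-separated text: one gene set per line, with the set name, a description field, and then the gene identifiers. Here is a small sketch in Python of building one from scratch; the set names, gene identifiers and output file name are placeholders, not taken from the paper.

```python
def to_gmt_lines(gene_sets, description="na"):
    """Format {set_name: [genes]} as GMT lines: name, description, genes (tab-separated)."""
    lines = []
    for name, genes in gene_sets.items():
        unique = list(dict.fromkeys(genes))  # drop duplicates, keep original order
        lines.append("\t".join([name, description] + unique))
    return lines


def write_gmt(path, gene_sets, description="na"):
    """Write the gene sets to a .gmt file, one set per line."""
    with open(path, "w") as handle:
        for line in to_gmt_lines(gene_sets, description):
            handle.write(line + "\n")


# Placeholder gene sets; replace with your own curated lists.
gene_sets = {
    "tumorigenesis_set": ["lncRNA_1", "lncRNA_2", "lncRNA_3"],
    "metastasis_set": ["lncRNA_4", "lncRNA_5", "lncRNA_4"],
}
lines = to_gmt_lines(gene_sets)
# write_gmt("custom_sets.gmt", gene_sets)  # uncomment to save to disk
```

A file written this way should be readable by GSEA tools that accept .gmt input; in fgsea, `gmtPathways()` is the function for loading such a file. The identifiers in the gene set must match the identifiers used in your ranked list.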
