Volcano plot for multiple clusters
5.0 years ago

Hello, I am trying to make a volcano plot for different clusters. I have two conditions, untreated vs. treated. I have a differential expression Excel file that Cell Ranger generated for me, but the file contains multiple clusters, each with its own fold change and p-value. How do I create a volcano plot that contains all the clusters rather than just one? Would I have to make a volcano plot for each cluster and then combine them all somehow?

I use this code to generate the plot for just one of the clusters...

library(EnhancedVolcano)
macrophage_list <- read.table("differential_expression_macrophage.csv", header = TRUE, sep = ",")

# note: pCutoff = 10e-5 is the same as 1e-4
EnhancedVolcano(macrophage_list, lab = as.character(macrophage_list$FeatureName),
  x = 'untreated.Log2.Fold.Change', y = 'untreated.P.Value', xlim = c(-8, 8),
  title = 'Macrophage', pCutoff = 10e-5, FCcutoff = 1.5, pointSize = 3.0, labSize = 3.0)

Any help and suggestions are greatly appreciated.

RNA-Seq R volcanoplot • 5.0k views
5.0 years ago

Hey,

I would generate a separate volcano plot for each cluster, but arrange them all within the same plot space. You can do it this way:

v1 <- EnhancedVolcano(...)
v2 <- EnhancedVolcano(...)
v3 <- EnhancedVolcano(...)
v4 <- EnhancedVolcano(...)

library(gridExtra)
library(grid)
grid.arrange(v1, v2, v3, v4, ncol = 2, nrow = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))
grid.rect(gp=gpar(fill=NA))

You could also bind the results tables together and plot all p-values and fold-changes in the same plot, but using, for example, a different shape for each respective comparison. Apart from requiring a bit more coding, this would also make the plot space very crowded, I think.

Kevin


Hi Kevin,

So create a volcano plot for each of my 20 clusters and then combine them with the code you gave me?

Thanks


If you want.
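
For 20 clusters, writing out v1 to v20 by hand gets tedious, so here is a minimal sketch of building the plots in a loop. It assumes the wide Cell Ranger file was read with read.csv(), so that the per-cluster columns come out as Cluster.N.Log2.fold.change and Cluster.N.Adjusted.p.value and the gene names as Feature.Name. The helpers cluster_cols() and make_volcanoes() are hypothetical names made up for this illustration, and the cutoffs are simply the ones from the original post.

```r
# Hypothetical helper: column names holding cluster i's fold change and p-value,
# assuming read.csv() turned the spaces in the headers into dots.
cluster_cols <- function(i) {
  c(x = paste0("Cluster.", i, ".Log2.fold.change"),
    y = paste0("Cluster.", i, ".Adjusted.p.value"))
}

# Hypothetical helper: one EnhancedVolcano plot per requested cluster.
# Requires the EnhancedVolcano package when it is actually called.
make_volcanoes <- function(de, clusters) {
  lapply(clusters, function(i) {
    cols <- cluster_cols(i)
    EnhancedVolcano::EnhancedVolcano(de,
      lab = as.character(de$Feature.Name),
      x = cols[["x"]], y = cols[["y"]],
      title = paste("Cluster", i),
      pCutoff = 10e-5,  # equals 1e-4; value kept from the original post
      FCcutoff = 1.5)
  })
}

# Usage, assuming `de` holds the wide table and gridExtra is installed:
# plots <- make_volcanoes(de, 1:20)
# gridExtra::grid.arrange(grobs = plots, ncol = 5)
```

Passing the list through the grobs argument of grid.arrange() avoids typing every plot object as a separate argument.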


I uploaded only 2 of the clusters as a test, ran the following code, and got the following error.

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2, nrow = 2, top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))
Error in `$<-.data.frame`(`*tmp*`, "wrapvp", value = list(x = 0.5, y = 0.5,  : 
  replacement has 17 rows, data has 31328

Hey, how did you create v1_macrophage and v2_macrophage?

Could you just try:

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))

v1_macrophage <- read.table("cluster1_marcophage.csv", header = T, sep = ",")
v2_macrophage <- read.table("cluster2_marcophage.csv", header = T, sep = ",")

I got this error again when running the grid.arrange command:

Error in `$<-.data.frame`(`*tmp*`, "wrapvp", value = list(x = 0.5, y = 0.5,  : 
  replacement has 17 rows, data has 31328


Oh, they should be separate EnhancedVolcano objects, like this:

v1_macrophage <- EnhancedVolcano(...)
v2_macrophage <- EnhancedVolcano(...)

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))

Hi Kevin,

I created 2 separate EnhancedVolcano objects for 2 of the clusters. Everything worked fine, but I was wondering how I would combine the data into one plot rather than having 20 separate volcano plots?

I appreciate your help.


Then, you will have to try the other option, i.e., merge (rbind()) the results tables and just use that.


How do I go about merging the results tables, given that all the p-values and log fold change values are already in my one file? They're just labeled by the cluster they belong to.

Thanks again for your help.


Oh, they are already in the same file? Can you paste an example of the data?


These are the headers.

Feature ID  Feature Name    Cluster 1 Mean Counts   Cluster 1 Log2 fold change  Cluster 1 Adjusted p value  Cluster 2 Mean Counts   Cluster 2 Log2 fold change  Cluster 2 Adjusted p value  Cluster 3 Mean Counts   Cluster 3 Log2 fold change  Cluster 3 Adjusted p value  Cluster 4 Mean Counts   Cluster 4 Log2 fold change  Cluster 4 Adjusted p value  Cluster 5 Mean Counts   Cluster 5 Log2 fold change  Cluster 5 Adjusted p value  Cluster 6 Mean Counts   Cluster 6 Log2 fold change  Cluster 6 Adjusted p value  Cluster 7 Mean Counts   Cluster 7 Log2 fold change  Cluster 7 Adjusted p value  Cluster 8 Mean Counts   Cluster 8 Log2 fold change  Cluster 8 Adjusted p value  Cluster 9 Mean Counts   Cluster 9 Log2 fold change  Cluster 9 Adjusted p value  Cluster 10 Mean Counts  Cluster 10 Log2 fold change Cluster 10 Adjusted p value Cluster 11 Mean Counts  Cluster 11 Log2 fold change Cluster 11 Adjusted p value Cluster 12 Mean Counts  Cluster 12 Log2 fold change Cluster 12 Adjusted p value Cluster 13 Mean Counts  Cluster 13 Log2 fold change Cluster 13 Adjusted p value Cluster 14 Mean Counts  Cluster 14 Log2 fold change Cluster 14 Adjusted p value Cluster 15 Mean Counts  Cluster 15 Log2 fold change Cluster 15 Adjusted p value Cluster 16 Mean Counts  Cluster 16 Log2 fold change Cluster 16 Adjusted p value Cluster 17 Mean Counts  Cluster 17 Log2 fold change Cluster 17 Adjusted p value Cluster 18 Mean Counts  Cluster 18 Log2 fold change Cluster 18 Adjusted p value Cluster 19 Mean Counts  Cluster 19 Log2 fold change Cluster 19 Adjusted p value Cluster 20 Mean Counts  Cluster 20 Log2 fold change Cluster 20 Adjusted p value
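
With a wide layout like this, "merging" amounts to stacking each cluster's fold change and p-value columns into one long table with a cluster label. Below is a minimal sketch in base R. It uses a made-up toy table with two clusters, and it assumes the file was read with read.csv(), which turns the spaces in the headers into dots. The helper one_cluster() and the output column names (FeatureName, Cluster, log2FC, padj) are hypothetical choices for this example.

```r
# Toy stand-in for the wide Cell Ranger table (two clusters instead of 20).
# Column names mimic what read.csv() makes of the headers above (spaces -> dots).
de <- data.frame(
  Feature.ID = c("ID1", "ID2", "ID3"),
  Feature.Name = c("GeneA", "GeneB", "GeneC"),
  Cluster.1.Mean.Counts = c(1.2, 0.4, 3.1),
  Cluster.1.Log2.fold.change = c(2.5, -0.3, 1.1),
  Cluster.1.Adjusted.p.value = c(1e-8, 0.7, 0.02),
  Cluster.2.Mean.Counts = c(0.9, 2.2, 2.8),
  Cluster.2.Log2.fold.change = c(0.2, -3.0, 0.9),
  Cluster.2.Adjusted.p.value = c(0.5, 1e-6, 0.04)
)

# Hypothetical helper: pull one cluster's columns into a tidy block.
one_cluster <- function(de, i) {
  prefix <- paste0("Cluster.", i, ".")
  data.frame(
    FeatureName = de$Feature.Name,
    Cluster = i,
    log2FC = de[[paste0(prefix, "Log2.fold.change")]],
    padj = de[[paste0(prefix, "Adjusted.p.value")]]
  )
}

clusters <- 1:2  # would be 1:20 for the full file
# Stack the per-cluster blocks: one row per gene per cluster.
long <- do.call(rbind, lapply(clusters, function(i) one_cluster(de, i)))
long$Cluster <- factor(long$Cluster, levels = clusters)
```

The resulting long table can go into a single ggplot2 call, for example ggplot(long, aes(x = log2FC, y = -log10(padj), colour = Cluster)) + geom_point(), although with 20 clusters the plot is likely to be crowded.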
5.0 years ago
TriS ★ 4.7k

Or, if you want to put them all together, you can just color each group differently.

library(ggplot2)
# shape = 21 is a filled circle, so the fill aesthetic actually shows up on the points
ggplot(macrophage_list, aes(x = untreated.Log2.Fold.Change, y = -log10(untreated.P.Value), fill = myClusters)) +
  geom_point(size = 3, shape = 21) + geom_hline(yintercept = -log10(10e-5)) + geom_vline(xintercept = 0) + xlim(-8, 8)

where myClusters is the factor that indicates the clusters you are interested in


Hi TriS

So in the "myCluster" section put all the clusters that I want?

I have 20 clusters within the differential expression file, so do fill = Cluster 1 Log2 fold change, Cluster 1 Adjusted p value, etc?

The file is labeled with the following headers... Feature Name Cluster 1 Mean Counts Cluster 1 Log2 fold change Cluster 1 Adjusted p value (for all 20).

Thanks


If those are the column names, then you must have some sort of genes as row names. You have to do a little bit of coding to define which genes belong to which cluster:

1- based on the cluster p-value and log2FC, define what belongs to what cluster
2- repeat that for all 20 clusters
3- color each gene based on its cluster #

I don't know how you calculated the clusters, but generally one gene should belong to only one cluster.

the end results should give you something like:

GENE   CLUSTER
x   1
y   1
x   2
z   4
h   3

and so on and so forth...
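
One possible way to produce such a gene-to-cluster table is sketched below. The rule used here, assigning each gene to the cluster where its adjusted p-value is smallest, is only an assumption for illustration and is not something specified in this thread. The toy data and the column names (FeatureName, Cluster, log2FC, padj) are made up as well.

```r
# Toy long-format table: one row per gene per cluster (made-up values).
long <- data.frame(
  FeatureName = rep(c("GeneA", "GeneB", "GeneC"), times = 2),
  Cluster = rep(c(1, 2), each = 3),
  log2FC = c(2.5, -0.3, 1.1, 0.2, -3.0, 0.9),
  padj = c(1e-8, 0.7, 0.02, 0.5, 1e-6, 0.04)
)

# Assumed rule: a gene goes to the cluster where its adjusted p-value is smallest.
# Sort by gene, then by p-value, and keep the first row seen for each gene.
ord <- long[order(long$FeatureName, long$padj), ]
assignment <- ord[!duplicated(ord$FeatureName), c("FeatureName", "Cluster")]
rownames(assignment) <- NULL
```

The assignment table can then be merged back onto the plotting data, for example with merge(), and its Cluster column (converted to a factor) can play the role of myClusters.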
