EnhancedVolcano plot rowname question
5.8 years ago
nsmaan ▴ 10

Hello all, I am using one of your sample scripts to test my data in volcano plots. My file has 3 columns: Gene, log2FoldChange, and pvalue. By default, I keep getting the rownames as labels. What I want to plot as labels is the "Gene" name, not rownames like 1, 2, 3, etc.

Could you please help?

This is the sample script:

res <- read.table("results.txt", header=TRUE)

head(res)

#rownames(res) <- sub("Gene", "", rownames(res))

EnhancedVolcano(res,
    lab = rownames(res),
    x = "log2FoldChange",
    y = "pvalue",
    ylab = bquote(~-Log[10]~italic(Pvalue)),
    pCutoff = 10e-5,
    FCcutoff = 1.5,
    #xlim=c(-5.5, 5.5),
    #ylim=c(0, -log10(10e-12)),
    transcriptLabSize = 3.5,
    title = "Drug+Toxin VS Ctrl results",
    legendPosition = "right",
    legendLabSize = 14,
    col = c("grey30", "forestgreen", "royalblue", "red2"),
    colAlpha=0.9,
    #DrawConnectors = TRUE,
    widthConnectors=0.2)
RNA-Seq • 8.1k views
ADD COMMENT

Kevin, thank you very much for your prompt response. Could you also let me know how to give the labeled genes on the left and right different shapes, sizes, and colors (e.g., genes on the right in red, star-shaped, and larger than the default)?

P.S. I am installing your newer version of EnhancedVolcano.

ADD REPLY

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

This comment belongs under @Kevin's answer.

ADD REPLY
5.8 years ago

Hey, you then just need to specify:

lab = res$Gene
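
To see why rownames(res) gives 1, 2, 3, here is a minimal sketch on a toy data frame. The gene names are borrowed from the asker's data and the numbers are made up for illustration; a data frame read without a row-name column gets automatic row names, while the Gene column holds the names you actually want as labels.

```r
# Toy data frame: gene names from the asker's data, numeric values are made up.
res <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna"),
  log2FoldChange = c(-1.0, -4.2, -3.0),
  pvalue = c(1e-6, 1e-5, 1e-4),
  stringsAsFactors = FALSE
)

rownames(res)  # automatic row names: "1" "2" "3"
res$Gene       # the Gene column: "Ptprs" "Cd163" "Fcna"
```

Passing res$Gene to lab therefore hands the function the gene names instead of the automatic row names.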

Which version are you using, by the way? In the latest [devel] release, DrawConnectors has become drawConnectors. A lot of other new features have been added, too: https://github.com/kevinblighe/EnhancedVolcano

You can install the latest with:

devtools::install_github('kevinblighe/EnhancedVolcano')

Kevin

ADD COMMENT

Kevin, other experts: I saw this volcano plot in a 2011 paper. They showed a plot based on FDR and FC cutoffs in which a selected number of top genes (e.g., 30) were labeled. They also represented the absolute fold change of all genes by circle size and by different color codes. Could anyone please provide sample scripts to do something similar, if available? Thank you.

Image Link: https://www.researchgate.net/figure/A-volcano-plot-representation-of-the-differentially-expressed-genes-in-a-pair-wise_fig2_51862161

ADD REPLY

That plot could be mostly reproduced using EnhancedVolcano. The functionality for point size scaling based on statistical significance is not yet available, but it will be in later versions. The functionality for label boxes with lines drawn to the points is already available (look in the vignette). The functionality for drawing ellipses around points is not yet available. I instead chose to let users identify groups of transcripts / genes by:

  • colour
  • shape
  • shade
ADD REPLY

Functionality to change the shape of the points was only recently added. See the vignette at these parts:

It will take you a bit of work to do this if you are a beginner in R.

ADD REPLY

Yes, unfortunately I am a beginner. However, to learn, I am going through the examples one by one in "Publication-ready volcano plots with enhanced colouring and labeling".

1) The first few work fine, but when I add shape, it says unused argument (shape = 8). I have version EnhancedVolcano_1.0.1.

2) Also, how can we get counts of the data points in each quadrant of the plot?

Thank you, and I appreciate the time users take to answer the comments...

ADD REPLY

If you install via devtools::install_github('kevinblighe/EnhancedVolcano'), the version should be 1.1.3. Can you confirm?

For the issues with shape, you will have to post all commands that you're using, and also a sample of your input data.

Also, how can we get counts of the data points in each quadrant of the plot?

I would honestly just do that manually. For example, this will count the genes with pvalue < 0.01 and log2FoldChange > 2:

nrow(subset(res, pvalue<0.01 & log2FoldChange > 2))
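
The same idea can be extended to all regions of the plot at once. The following is only a sketch on made-up data: the cut-offs mirror the example above, and the names p_cut, fc_cut, direction, and significant are hypothetical helpers, not part of EnhancedVolcano.

```r
# Toy data standing in for 'res'; all values are made up for illustration.
res <- data.frame(
  Gene = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(3.1, -2.5, 0.4, 2.2, -3.0),
  pvalue = c(0.001, 0.0005, 0.2, 0.5, 0.03),
  stringsAsFactors = FALSE
)

p_cut <- 0.01   # hypothetical p-value cut-off
fc_cut <- 2     # hypothetical log2 fold-change cut-off

# Classify each gene by direction of change relative to the fold-change cut-off.
direction <- factor(
  ifelse(res$log2FoldChange > fc_cut, "up",
         ifelse(res$log2FoldChange < -fc_cut, "down", "unchanged")),
  levels = c("down", "unchanged", "up")
)

# Classify each gene by whether it passes the p-value cut-off.
significant <- factor(
  ifelse(res$pvalue < p_cut, "significant", "not significant"),
  levels = c("not significant", "significant")
)

# Cross-tabulate: one count per region of the volcano plot.
counts <- table(direction, significant)
print(counts)
```

Each cell of counts corresponds to one region delimited by the cut-off lines, so counts["up", "significant"] is the number of points in the upper-right region.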
ADD REPLY

Thanks K. Yes, with install_github('kevinblighe/EnhancedVolcano'), shape is working, I think. I am close to getting what I want using the script below:

1) However, I am still getting numbers instead of the 'Gene' name in my labels.

2) Also, I am doing something wrong in the cutoff for labels. This line works for me:

keyvals[which(res2$log2FoldChange > 2.0)] <- 'green'

but when I change it to

keyvals[which(res2, padj<0.05 & log2FoldChange > 2.0)] <- 'green'

everything is black.

head(res)
   Gene log2FoldChange   pvalue     padj
1 Ptprs      -1.044483 3.88e-14 4.73e-10
2 Cd163      -4.219374 4.16e-13 1.02e-09
3  Fcna      -3.046358 5.03e-13 1.02e-09
4 Ces2j       1.825855 2.54e-13 1.02e-09
5 Vsig4      -5.002890 4.33e-13 1.02e-09
6 Ces2a       1.739246 7.79e-13 1.31e-09

library(dplyr)
library(ggplot2)
library(ggrepel)
library(EnhancedVolcano)
res2 <- read.table("results.txt", header=TRUE)
keyvals <- rep('black', nrow(res2))
names(keyvals) <- rep('Mid', nrow(res2))
keyvals[which(res2$log2FoldChange > 2.0)] <- 'green'
names(keyvals)[which(res2$log2FoldChange > 2.0)] <- 'high'
keyvals[which(res2$log2FoldChange < -2.0)] <- 'royalblue'
names(keyvals)[which(res2$log2FoldChange < -2.0)] <- 'low'
unique(names(keyvals))
unique(keyvals)
keyvals[1:20]
EnhancedVolcano(res2,
    lab = res2$Gene,
    x = 'log2FoldChange',
    y = 'padj',
    selectLab =res2$Gene[which(names(keyvals) %in% c('high', 'low'))],
    xlim = c(-8,8),
    xlab = bquote(~Log[2]~ 'fold change'),
    ylab = bquote(~Log[10]~ 'padj'),
    title = 'Custom colour over-ride',
    pCutoff = 10e-6,
    FCcutoff = 2.0,
    transcriptPointSize = 1,
    transcriptLabSize = 4.5,
    shape = c(6, 4, 2, 11),
    colCustom = keyvals,
    colAlpha = 1,
    legendPosition = 'top',
    legendLabSize = 15,
    legendIconSize = 5.0,
    drawConnectors = FALSE,
    widthConnectors = 0.5,
    colConnectors = 'grey50',
    gridlines.major = TRUE,
    gridlines.minor = FALSE,
    border = 'partial',
    borderWidth = 1.5,
    borderColour = 'black')
ADD REPLY

1) However, I am still getting numbers instead of 'Gene' name in my labels,

That is likely because your Gene variable is encoded as a factor. Try this before running anything else (substitute res2 if that is the name of your data frame):

res$Gene <- as.character(res$Gene)
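
As a small illustration of the factor issue (a sketch, assuming Gene was read in as a factor, which was the read.table() default before R 4.0.0): a factor stores integer codes plus a set of levels, so code that treats it as numbers ends up with the codes rather than the names.

```r
# Assumption: 'Gene' arrived as a factor; gene names taken from the data shown above.
genes <- factor(c("Ptprs", "Cd163", "Fcna"))

codes <- as.integer(genes)        # the integer codes stored inside the factor
gene_names <- as.character(genes) # the gene names themselves

print(codes)
print(gene_names)
```

Alternatively, reading the file with read.table("results.txt", header = TRUE, stringsAsFactors = FALSE) keeps Gene as character from the start.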

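For the other part, which() expects a single logical vector, so combine the two conditions with & and reference each column through res2$: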

keyvals[which(res2$padj<0.05 & res2$log2FoldChange > 2.0)] <- 'green'
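
A minimal sketch of the whole key-value construction, using the six rows shown in head(res) above. The helper names up and down are hypothetical; storing each condition once means the colour and the name of every point are set from the same test and cannot drift apart.

```r
# The six rows shown in head(res) above, rebuilt by hand for a self-contained example.
res2 <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna", "Ces2j", "Vsig4", "Ces2a"),
  log2FoldChange = c(-1.044483, -4.219374, -3.046358, 1.825855, -5.002890, 1.739246),
  padj = c(4.73e-10, 1.02e-09, 1.02e-09, 1.02e-09, 1.02e-09, 1.31e-09),
  stringsAsFactors = FALSE
)

# Hypothetical helpers: one logical vector per group, built from both cut-offs.
up   <- res2$padj < 0.05 & res2$log2FoldChange > 2.0
down <- res2$padj < 0.05 & res2$log2FoldChange < -2.0

# Default colour and name for every point.
keyvals <- rep('black', nrow(res2))
names(keyvals) <- rep('Mid', nrow(res2))

# Colour and name are assigned with the same condition.
keyvals[which(up)] <- 'green'
names(keyvals)[which(up)] <- 'high'
keyvals[which(down)] <- 'royalblue'
names(keyvals)[which(down)] <- 'low'

print(table(names(keyvals)))
```

In these six rows no gene exceeds a log2 fold change of 2, so only the 'low' and 'Mid' groups appear.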
ADD REPLY

Yes, Gene was a factor. That is working now. However, the script still shows an error about reading the object 'res2log2FoldChange'.

I modified it as:

res2 <- read.table("results.txt", header=TRUE)
res2$Gene <- as.character(res2$Gene)
keyvals <- rep('black', nrow(res2))
names(keyvals) <- rep('Mid', nrow(res2))
keyvals[which(res2$padj<0.05 & res2log2FoldChange > 2.0)] <- 'green'
names(keyvals)[which(res2$log2FoldChange > 2.0)] <- 'high'
keyvals[which(res2$padj<0.05 & res2log2FoldChange < -2.0)] <- 'royalblue'
names(keyvals)[which(res2$log2FoldChange < -2.0)] <- 'low'
unique(names(keyvals))
unique(keyvals)
keyvals[1:20]
EnhancedVolcano(res2,
    lab = res2$Gene,
    x = 'log2FoldChange',
    y = 'padj',
    selectLab =res2$Gene[which(names(keyvals) %in% c('high', 'low'))],
..........
ADD REPLY

Check your code again. You are missing a dollar sign, $, between res2 and log2FoldChange in two of your lines, for example:

keyvals[which(res2$padj<0.05 & res2log2FoldChange > 2.0)] <- 'green'
ADD REPLY

Embarrassed.

The script is working well now. Also, out of curiosity, is it possible to render just the labeled genes differently (e.g., as filled markers or larger markers)?

Thanks for all your help!

ADD REPLY

Some shapes are by default filled or unfilled. If you look in the vignette, you can see how some are unfilled, for example, like here:

[Image: ex4-2]

You can have certain shapes for your genes of interest via the shapeCustom parameter. It functions in the same way as colCustom. An example here: https://github.com/kevinblighe/EnhancedVolcano#over-ride-colour-andor-shape-scheme-with-custom-key-value-pairs

You can check all possible shapes here: http://sape.inf.usi.ch/quick-reference/ggplot2/shape

Regarding size, currently, you can only change the global size for all shapes via transcriptPointSize. In the next version, I will add functionality to have different sizes. Note, however, that by default some shapes are different sizes. For example, 20 and 21 are both circles, but 21 is larger.
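
A rough sketch of building such a shape vector, assuming shapeCustom takes a named vector with one shape code per row, in the same way colCustom does. The gene list genes_of_interest and the group names are hypothetical; shape 8 is the asterisk and shape 16 a filled circle.

```r
# Toy data frame; gene names are from the data shown earlier in this thread.
res2 <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna", "Ces2j", "Vsig4", "Ces2a"),
  stringsAsFactors = FALSE
)

# Hypothetical selection of genes that should get their own shape.
genes_of_interest <- c("Cd163", "Vsig4")
is_selected <- res2$Gene %in% genes_of_interest

# One shape code per row: 8 (asterisk) for selected genes, 16 (filled circle) otherwise.
keyvals.shape <- ifelse(is_selected, 8, 16)
names(keyvals.shape) <- ifelse(is_selected, 'of interest', 'other')

print(keyvals.shape)

# The vector would then be passed as shapeCustom = keyvals.shape
# in the EnhancedVolcano() call (not run here).
```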

ADD REPLY

This is very cool. I am playing around with these options now to find the best view. Thanks. I noticed one more thing that I am not sure how to correct, though: if I toggle drawConnectors between TRUE and FALSE (everything else the same), I get a lot more labeled genes with TRUE. My aim is to have connector lines only for the genes that were labeled with the FALSE option above.

ADD REPLY

Yes, more labels will fit in the plot space with drawConnectors = TRUE. If you are only interested in labeling certain genes, then just pass these genes as a vector to selectLab. You can also now draw a box around each label with boxedlabels = TRUE
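
One way to build such a vector, sketched on the six rows shown earlier in this thread: rank the genes by adjusted p-value and keep the first few. The names top_n and selected are hypothetical, and ranking by padj is only one possible criterion.

```r
# The six rows shown in head(res) earlier, rebuilt by hand for a self-contained example.
res2 <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna", "Ces2j", "Vsig4", "Ces2a"),
  log2FoldChange = c(-1.044483, -4.219374, -3.046358, 1.825855, -5.002890, 1.739246),
  padj = c(4.73e-10, 1.02e-09, 1.02e-09, 1.02e-09, 1.02e-09, 1.31e-09),
  stringsAsFactors = FALSE
)

top_n <- 3  # hypothetical number of genes to label

# Order genes by adjusted p-value (smallest first) and keep the first top_n names.
selected <- head(res2$Gene[order(res2$padj)], top_n)
print(selected)

# The vector would then be passed to the plotting call (not run here), e.g.:
# EnhancedVolcano(res2, lab = res2$Gene, x = 'log2FoldChange', y = 'padj',
#     selectLab = selected, drawConnectors = TRUE)
```

Because the label set is fixed in advance, toggling drawConnectors should no longer change which genes are candidates for labeling.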

ADD REPLY
