Volcano_plot using R
1
0
Entering edit mode
5.9 years ago
Shina ▴ 10

using the following R Script i want to generate Volcano plot.but my volcano plot is not correct because their is no connecting point between -logpvalue and log2foldchange value if i run the example as provided on web i get the corrected one its mean the script is correct but something is wrong in my data i try alot but not able to identify the mistake.help me in this regard.

My data:

Gene    log2FoldChanges pvalue  pvalue
CAR3    -2.024  0.0103  0.0103
SAA2    -2.019  0.0624  0.0624
GADD45A -1.752  0.0015  0.0015
SAA1    -1.599  0.0735  0.0735
NREP    -1.265  0.0210  0.021
GM6484  -1.134  0.0034  0.0034
ARRDC3  -1.085  0.0104  0.0104
NRP1    -1.082  0.0108  0.0108
TGM1    -1.072  0.0070  0.007
CYP7A1  -1.068  0.0128  0.0128
RNF186  -1.020  0.0054  0.0054
PIK3CD  -1.003  0.0007  0.0007
SOCS3   -0.963  0.0193  0.0193
F3  -0.945  0.0059  0.0059
MID1IP1 -0.942  0.0116  0.0116
SERINC3 -0.941  0.0006  0.0006
GCK -0.933  0.0000  0

Code

res <- read.table("results.txt", header=TRUE)
head(res)
alpha <- 0.05 # Threshold on the adjusted p-value
cols <- densCols(res$log2FoldChange, -log10(res$pvalue))
plot(res$log2FoldChange, -log10(res$pvalue), col=cols, panel.first=grid(),
     main="Volcano plot D3 VS D1", xlab="log2(fold-change)", ylab="pvalue",
     pch=20, cex=0.6)
abline(v=0)
abline(v=c(-1,1), col="brown")
abline(h=-log10(alpha), col="brown")
with(subset(res, pvalue>alpha | abs(log2FoldChange)<.6), points(log2FoldChange, -log10(pvalue), pch=20, col="blue"))
with(subset(res, pvalue<alpha | abs(log2FoldChange)<.6), points(log2FoldChange, -log10(pvalue), pch=20, col="black"))
with(subset(res, pvalue>alpha | abs(log2FoldChange)>.6), points(log2FoldChange, -log10(pvalue), pch=20, col="black"))
with(subset(res, pvalue<alpha & abs(log2FoldChange)>=.6  & pvalue<.5 & abs(log2FoldChange)<1), points(log2FoldChange, -log10(pvalue), pch=20, col="orange"))
with(subset(res, pvalue<alpha & abs(log2FoldChange)>=1), points(log2FoldChange, -log10(pvalue), pch=20, col="green"))
R DEGs Volcanoplot • 14k views
ADD COMMENT
1
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
code_formatting

ADD REPLY
0
Entering edit mode

Thanks for your suggestion i will do that.

ADD REPLY
0
Entering edit mode

I tried but always get this error message

 devtools::install_github('kevinblighe/EnhancedVolcano')
   Error in curl::curl_fetch_memory(url, handle = h) : 
      Failed to connect to api.github.com port 443: Timed out
  
ADD REPLY
1
Entering edit mode

That error is not related to the GitHub repository. It is likely some local issue, i.e., a problem on your side of things. Check that there are no restrictions in place for downloading and installing.

ADD REPLY
1
Entering edit mode

Its not possible to download on my lab PC i will try it on my laptop hopefully it will work thanks for your help stay bless!

ADD REPLY
1
Entering edit mode

For your data, here is the example code and plot:

test=read.csv("test.txt", header = T, stringsAsFactors = F, strip.white = T, sep="\t")
library(ggplot2)

ggplot(test,
       aes(log2FoldChanges, -log10(pvalue), color = factor(
         ifelse(
           abs(log2FoldChanges) > 1.2 &
             abs(log2FoldChanges) < 1.8,
           "sig",
           "nonsig"
         ))))+
  geom_point() +
  geom_hline(yintercept = 2, color = "red") +
  geom_vline(xintercept = c(-1.8, -1.2), color = "red") +
  scale_color_manual(name = "Significance",
                     values = c("red", "blue"))

Rplot

ADD REPLY
0
Entering edit mode

Thanks how can i add the names of the significant genes in a more good way and avoid the overlapping and also put the connector.

ADD REPLY
0
Entering edit mode

try this:

ggplot(test,
       aes(log2FoldChanges, -log10(pvalue), label=Gene, color = 
         ifelse(
           abs(log2FoldChanges) > 1.2 &
             abs(log2FoldChanges) < 1.8,
           "Sig",
           "Nonsig"
         )))+
  geom_point() +
  ylab(expression (log[10]~"P"))+
  xlab(expression(log[2]~Fold~Changes))+
  geom_hline(yintercept = 2, color = "red") +
  geom_vline(xintercept = c(-1.8, -1.2), color = "red") +
  scale_color_manual(name = "Significance", values = c("red", "blue"))+
  geom_text_repel(show.legend = FALSE)+
  theme_bw(base_size = 14)

Rplot01

ADD REPLY
1
Entering edit mode

For partial labeling:

ggplot(test,
       aes(log2FoldChanges, -log10(pvalue), label=ifelse(
         abs(log2FoldChanges) < 1.2 |
           abs(log2FoldChanges) > 1.8,
         Gene,
         ""
       ), color = 
         ifelse(
           abs(log2FoldChanges) > 1.2 &
             abs(log2FoldChanges) < 1.8,
           "Sig",
           "Nonsig"
         )))+
  geom_point() +
  ylab(expression (log[10]~"P"))+
  xlab(expression(log[2]~Fold~Changes))+
  geom_hline(yintercept = 2, color = "red") +
  geom_vline(xintercept = c(-1.8, -1.2), color = "red") +
  scale_color_manual(name = "Significance", values = c("red", "blue"))+
  geom_text_repel(show.legend = F)+
  theme_bw(base_size = 14)

Rplot02

ADD REPLY
0
Entering edit mode

thanks for the code I try it but it always gives this error

Error in geom_text_repel(show.legend = FALSE) : 
  could not find function "geom_text_repel"
ADD REPLY
1
Entering edit mode

geom_text_repel is from the package ggrepel. So,:

library(ggrepel)
ADD REPLY
0
Entering edit mode

Is there any way to put the names of the Genes having high fold change values with connectors.? using Calibrate library.

library(calibrate)

here is my code

with(subset(res, pvalue<.05 & abs(log2FoldChanges)>1.6), textxy(log2FoldChanges, -log10(pvalue), labs=Gene, cex=.6))
ADD REPLY
1
Entering edit mode

You can easily do that with EnhancedVolcano by setting thresholds for log2FC and / or specifying your genes of interest with selectLab. Finally, to draw the connectors, set drawConnectors = TRUE. If you want boxes around the labels, set boxedLabels = TRUE.

An example of this, HERE

ex10-1

ADD REPLY
1
Entering edit mode
5.9 years ago
k.kathirvel93 ▴ 310

Its log2FoldChanges not log2FoldChange in your code.

ADD COMMENT
0
Entering edit mode

That is also a mistake thanks for pointing out .

ADD REPLY
0
Entering edit mode

Can you mention which tool you have used for DE? Why don't you use FDR value instead od PValue? and can you explain that how the plot shoul be ?

ADD REPLY
0
Entering edit mode

i uesed GeneXplain to identify DEGs their they only have p value and fold change calculation and they calculate the FDR for the whole number of Genes rather then for indivisuals so thats why i used the pvalue calculations rather then FDR, the problem with the plot is that their are no genes near to Zero Scale but later on i came to know that its because of my data distribution not by the script.

ADD REPLY

Login before adding your answer.

Traffic: 1620 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6