I have three datasets of DEGs from different experiments. Previously, I had plotted each dataset on its own using the R package EnhancedVolcano. However, my boss now wants me to combine all three datasets into one volcano plot similar to the one below.
The code was not supplied in the paper, and I do not know of any functionality to do this in EnhancedVolcano, so my assumption is that I would have to do this using ggplot2. However, I have been having trouble even plotting one dataset with the numbers. My plan would be to combine all datasets into a single data frame, with columns for log2fc, -log10(adjp), a number corresponding to the genes of interest, and dataset, as follows:
# read.csv takes row.names (not row.index) to use the first column as row names
data1<-read.csv('dataset1.csv',sep=',',row.names=1)
data2<-read.csv('dataset2.csv',sep=',',row.names=1)
data3<-read.csv('dataset3.csv',sep=',',row.names=1)
genes=c('STAT1','EGFR','PTEN' ...) # genes of interest
# going to add a column that numbers the genes of interest while leaving all others NA
gene_label<-rep(NA,nrow(data1))
gene_num<-1:length(genes)
for (j in seq_along(genes)){
if (genes[j] %in% rownames(data1)){
gene_label[which(rownames(data1)==genes[j])]<-gene_num[j]}}
# Enter new column for dataset
dataset=rep(1,nrow(data1))
# create new dataset
DF1<-cbind(data1,gene_label,dataset)
# Will probably have to change row names because data frames don't allow duplicate row names
rows1<-row.names(DF1)
rows1<-paste0(rows1,'_A')
row.names(DF1)<-rows1
# Repeat steps for other datasets ....
# finally combine datasets
DFplot<-rbind(DF1,DF2,DF3)
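The per-dataset steps above could probably be wrapped into one function so they do not have to be repeated by hand. The following is only a sketch: `prep_dataset` is a hypothetical helper (not from any package), the toy data frames stand in for the real CSV files, and the column names `log2FC` and `padj` are assumptions. It uses `match()` instead of the loop and keeps the gene name in its own column so the row names do not need a suffix.

```r
# Hypothetical helper: tag one DEG table with a dataset ID and a numeric
# label for each gene of interest (NA for all other genes).
prep_dataset <- function(df, genes, dataset_id) {
  df$gene <- rownames(df)              # keep the gene name as a column
  df$gene_label <- match(df$gene, genes)  # position in `genes`, NA if absent
  df$dataset <- dataset_id
  rownames(df) <- NULL                 # avoids duplicate row names in rbind
  df
}

# Toy stand-ins for the real CSVs; log2FC and padj are assumed column names
toy1 <- data.frame(log2FC = c(2.1, -1.5, 0.2), padj = c(1e-6, 1e-3, 0.8),
                   row.names = c("STAT1", "EGFR", "GAPDH"))
toy2 <- data.frame(log2FC = c(-0.7, 3.0), padj = c(0.04, 1e-9),
                   row.names = c("STAT1", "PTEN"))
genes_toy <- c("STAT1", "EGFR", "PTEN")

# Stack the tagged tables into one long data frame
DFtoy <- rbind(prep_dataset(toy1, genes_toy, 1),
               prep_dataset(toy2, genes_toy, 2))
```

With the real data this would be called once per dataset, e.g. `prep_dataset(data1, genes, 1)`, and the results passed to `rbind`.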
This is where I do not know where to go next. I do not need different point shapes as in the figure, but I do need the datasets to be different colors, as well as the numbers for the genes of interest. I can add the table afterwards in post-processing, but if anybody knows a way to do it in the figure itself, that would be great.
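One possible starting point for the plot itself, as a rough sketch only: map `dataset` to colour in `geom_point()` and draw the numbers with `geom_text()`. The toy data frame below stands in for the combined one, and the column names `log2FC` and `padj` are assumptions that would need to match the real files. The 0.05 cutoff line is just an illustrative choice, and the table is not handled here.

```r
library(ggplot2)

# Toy stand-in for the combined data frame; column names are assumed
toy <- data.frame(
  log2FC     = c(2.1, -1.5, 0.2, -0.7, 3.0),
  padj       = c(1e-6, 1e-3, 0.8, 0.04, 1e-9),
  gene_label = c(1, 2, NA, 1, 3),   # NA = not a gene of interest
  dataset    = c(1, 1, 1, 2, 2)
)

p <- ggplot(toy, aes(x = log2FC, y = -log10(padj),
                     colour = factor(dataset))) +
  geom_point(alpha = 0.6) +
  # rows with an NA label are dropped silently because na.rm = TRUE
  geom_text(aes(label = gene_label), vjust = -0.8, na.rm = TRUE,
            show.legend = FALSE) +
  # illustrative significance cutoff at adjusted p = 0.05
  geom_hline(yintercept = -log10(0.05), linetype = "dashed") +
  labs(x = "log2 fold change", y = "-log10(adjusted p)",
       colour = "Dataset") +
  theme_classic()

print(p)
```

Wrapping `dataset` in `factor()` gives one discrete colour per dataset rather than a continuous gradient.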
Thanks so much!