Hello All, how can I cluster the different GO categories so that they do not look scattered, as in my case?
3.1 years ago

The data frame is provided as an image: [Image: DataFrame]

My R code is:

Bubble_plot_5 <- read_excel("GO_Bubble_plot.xlsx", sheet = "GO_5")
view(Bubble_plot_5)

ggplot(Bubble_plot_5, aes(y = reorder(GO_Term, as.numeric(GO_Category)), x = Gene_Count,
                          size = Gene_Count)) +
  geom_point(aes(color = GO_Category), alpha = 3.0) +
  geom_tile(aes(width = Inf, fill = GO_Category), alpha = 0.4) +
  scale_fill_manual(values = c("green", "red", "blue"))

[Image: Bubbleplot]

Tags: R, Gene, ggplot2, ontology

Pro-tip: Rather than providing images of data, just use dput() on the data.frame and copy/paste the output. Also, please remove "hello" from the title; it does not belong there.
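
As a minimal sketch of that round trip (toy_df and its values are made up for illustration, they are not the OP's data): dput() prints R code that rebuilds the object, so anyone can paste it straight into their own session.

```r
# Hypothetical toy data frame, only for illustration.
toy_df <- data.frame(
  GO_Term = c("Translation", "Cytoplasm"),
  Gene_Count = c(27, 77),
  GO_Category = c("BP", "CC"),
  stringsAsFactors = FALSE
)

# dput() writes the object as R code; this is the text to paste into a post.
dput(toy_df)

# Readers can rebuild the object by evaluating that text.
dput_text <- paste(capture.output(dput(toy_df)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
```

For a large table, something like dput(head(toy_df, 20)) keeps the pasted output short.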


Thank you for the suggestion. I will see to it next time. The dput() output is as follows:

structure(list(GO_Term = c("Translation", "Signal transduction", "Regulation of transcription, DNA-templated", "Regulation of cell shape", "Phosphorelay signal transduction system", "Peptidoglycan biosynthetic process", "Mo-molybdopterin cofactor biosynthetic process", "Methylation", "Intracellular protein transmembrane transport", "Fatty acid biosynthetic process", "Electron transport chain", "Chemotaxis", "Cell wall organization", "Cell division", "Carbohydrate metabolic process", "Bacterial-type flagellum dependent cell motility", "Plasma membrane", "Integral component of membrane", "Extracellular region", "Cytoplasm", "Bacterial-type flagellum basal body", "ATP-binding cassette (ABC) transporter complex", "Zinc ion binding", "Transmembrane transporter activity", "rRNA binding", "Phosphorelay sensor kinase activity", "Oxidoreductase activity", "Methyltransferase activity", "Metalloendopeptidase activity", "Metal ion Binding", "Iron-sulfur cluster binding", "Hydrolase activity ", "Heme binding", "Electron transfer activity", "ATPase-coupled transmembrane transporter activity", "ATPase-coupled cation transmembrane transporter activity", "ATP binding", "4 iron, 4 sulfur cluster binding"), `Gene Count` = c(27, 18, 34, 4, 29, 5, 3, 10, 1, 5, 2, 19, 8, 5, 12, 16, 107, 212, 5, 77, 16, 15, 18, 22, 18, 16, 18, 6, 5, 97, 18, 54, 12, 14, 25, 2, 114, 28), `GO Category` = c("BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "CC", "CC", "CC", "CC", "CC", "CC", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF" )), row.names = c(NA, -38L), class = c("tbl_df", "tbl", "data.frame" ))


Just rearrange the terms in the Excel file in descending or ascending order and then read the file.
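
One caveat, sketched here under the assumption that GO_Term is read in as a character column: the row order of the file alone does not fix the axis order, because ggplot2 orders a character column alphabetically on a discrete axis, much like factor() does by default. Storing the terms as a factor whose levels follow the row order keeps the sorted order. The vector go_terms below is made-up toy data.

```r
# Toy terms, already arranged by category in the "file" (made-up example data).
go_terms <- c("Translation", "Chemotaxis", "Plasma membrane", "Cytoplasm")

# Default factor levels are alphabetical, so the file order is lost.
default_levels <- levels(factor(go_terms))

# Levels taken from the row order preserve the file order.
terms_in_file_order <- factor(go_terms, levels = unique(go_terms))
file_order_levels <- levels(terms_in_file_order)

print(default_levels)
print(file_order_levels)
```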


Could you show us the rest of your code please? (Including libraries.) There isn't anything inherently wrong with the code snippet here, and executing it on some dummy data produces a properly "grouped" plot.


I have provided the full table as dput() in a reply above. The complete code (including libraries) is as follows:

library(ggplot2)
library(forcats)
library(magrittr)
library(tidyverse)
library(readxl)
library("gplots")
library(writexl)
library(dplyr)
library(tidyr)
library(ggplot2)


Bubble_plot_5 <- read_excel("GO_Bubble_plot.xlsx", sheet = "GO_5")
view(Bubble_plot_5)

ggplot(Bubble_plot_5, aes(y = reorder(GO_Term, as.numeric(`GO Category`)), x = `Gene Count`,
                          size = `Gene Count`))+
  geom_point(aes(color = `GO Category`), alpha = 3.0)+
  geom_tile(aes(width = Inf, fill = `GO Category`), alpha = 0.4)+
  scale_fill_manual(values = c("darkcyan", "darkseagreen1", "lightgoldenrod"))+
  theme(panel.background = element_blank())+
  xlab("Gene Count")+
  ylab("GO Term")+
  theme(axis.line.x = element_line(color="black", size = 0.5),
        axis.line.y = element_line(color="black", size = 0.5),
        axis.text.y = element_text(angle = 0, size = 15),
        axis.text.x = element_text(angle = 0, size = 15),
        axis.title=element_text(size=14,face="bold"),
        legend.text=element_text(size=12),
        legend.title = element_text(size=14))

ggsave("GO5Bubbleplot.png", width = 10, height = 8, dpi = 400, limitsize = FALSE)

Thank you, really appreciate it. So the problem was with the GO_Term column not getting re-leveled. That old code of mine here is clunky. Sorry about that.

Here's how you could do this now (see code snippet below). Please note that I renamed the columns in your input data to avoid having to use backticks when referencing them (I replaced the whitespace with underscores).

I tested it on my R installation (version 4.1.1) and it worked properly. Let me know if this works for you also.

#OP's data.
Bubble_plot_5 <- structure(list(GO_Term = c("Translation", "Signal transduction", "Regulation of transcription, DNA-templated", "Regulation of cell shape", 
                                            "Phosphorelay signal transduction system", "Peptidoglycan biosynthetic process", 
                                            "Mo-molybdopterin cofactor biosynthetic process", "Methylation", "Intracellular protein transmembrane transport", 
                                            "Fatty acid biosynthetic process", "Electron transport chain", "Chemotaxis", "Cell wall organization", 
                                            "Cell division", "Carbohydrate metabolic process", "Bacterial-type flagellum dependent cell motility", 
                                            "Plasma membrane", "Integral component of membrane", "Extracellular region", "Cytoplasm", 
                                            "Bacterial-type flagellum basal body", "ATP-binding cassette (ABC) transporter complex", "Zinc ion binding", 
                                            "Transmembrane transporter activity", "rRNA binding", "Phosphorelay sensor kinase activity", 
                                            "Oxidoreductase activity", "Methyltransferase activity", "Metalloendopeptidase activity", "Metal ion Binding", 
                                            "Iron-sulfur cluster binding", "Hydrolase activity ", "Heme binding", "Electron transfer activity", 
                                            "ATPase-coupled transmembrane transporter activity", "ATPase-coupled cation transmembrane transporter activity", 
                                            "ATP binding", "4 iron, 4 sulfur cluster binding"), 
                                Gene_Count = c(27, 18, 34, 4, 29, 5, 3, 10, 1, 5, 2, 19, 8, 5, 12, 16, 107, 212, 5, 77, 16, 15, 18, 22, 18, 16, 18, 6, 5, 97, 18, 54, 12, 14, 25, 2, 114, 28), 
                                GO_Category = c("BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "BP", "CC", "CC", "CC", "CC", "CC", "CC", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF", "MF" )),
                           row.names = c(NA, -38L), class = c("tbl_df", "tbl", "data.frame" ))



#Libraries.

#Data import/export.
library(readxl)
library(writexl)

#Plotting
library(ggplot2)
library(gplots)

#Data munging.
library(forcats)
library(magrittr)
library(tidyr)
library(dplyr)


#Bubble_plot_5 <- read_excel("GO_Bubble_plot.xlsx", sheet = "GO_5")
#view(Bubble_plot_5)


#Group by GO_Category, sort GO_Term alphabetically within each group,
#then reorder the factor levels of GO_Term by GO_Category so that
#terms of the same category sit together on the axis.
Bubble_plot_5 %<>%
  group_by(GO_Category) %>%
  arrange(GO_Term, .by_group = TRUE) %>%
  ungroup() %>%
  mutate(GO_Term = forcats::fct_reorder(GO_Term, GO_Category))

#Plotting.
ggplot(Bubble_plot_5, mapping = aes(x = Gene_Count, 
                                    y = GO_Term, 
                                    size = Gene_Count)) + 
  geom_tile(mapping = aes(width = Inf, y = GO_Term, fill = GO_Category), alpha = 0.2) + 
  geom_point(mapping = aes(color = GO_Category), alpha = 1) +  #alpha is an opacity in [0, 1].
  scale_fill_manual(values = c("darkcyan", "darkseagreen1", "lightgoldenrod")) +
  scale_color_manual(values = c("darkcyan", "darkseagreen1", "lightgoldenrod")) +
  theme(panel.background = element_blank()) +
  xlab("Gene Count") +
  ylab("GO Term") +
  theme(axis.line.x = element_line(color = "black", size = 0.5),
        axis.line.y = element_line(color = "black", size = 0.5),
        axis.text.y = element_text(angle = 0, size = 15),
        axis.text.x = element_text(angle = 0, size = 15),
        axis.title = element_text(size = 14, face = "bold"),
        legend.text = element_text(size = 12),
        legend.title = element_text(size = 14))
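
For anyone wondering why the original reorder() call left the terms scattered, here is a small base-R sketch (toy data, made up for illustration; it assumes GO_Category is a character column, as in the dput() output above). as.numeric() on strings such as "BP" yields NA, so reorder() has no usable scores to sort by. Converting the category to integer codes via factor() first gives it something to sort on.

```r
# Hypothetical toy data, only for illustration.
toy <- data.frame(
  GO_Term = c("Translation", "Cytoplasm", "ATP binding", "Chemotaxis"),
  GO_Category = c("BP", "CC", "MF", "BP"),
  stringsAsFactors = FALSE
)

# Coercing category labels to numbers gives NA for every row.
bad_scores <- suppressWarnings(as.numeric(toy$GO_Category))

# Integer codes from a factor are usable scores: BP = 1, CC = 2, MF = 3.
category_code <- as.integer(factor(toy$GO_Category))

# reorder() sorts the levels of GO_Term by the mean score per term,
# which places terms of the same category next to each other.
term_grouped <- reorder(toy$GO_Term, category_code)
print(levels(term_grouped))
```

This is only an alternative way to get the grouping, not a replacement for the fct_reorder() call used in the snippet.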

Thank you so much, Dunois. It works completely fine now.
