Question

How to plot two categorical variables in ggridges?

14 months ago

c.e.chong ▴ 60

Hi,

I have a similar issue to the one in this post: https://stackoverflow.com/questions/52794659/plotting-two-categorical-vectors-in-ggridges

I would like to plot a categorical variable on the x-axis (system) against another categorical variable (disease type). However, I want the curve to rise and fall with the proportion of isolates in my dataset that have each particular system.

Here is an example of my data frame. I first tried converting the proportion values to whole integers and then using the uncount function to expand the dataset, as in the Stack Overflow post. However, this gave me multiple curves on the same graph rather than one curve fluctuating across the different systems on the x-axis. I think this didn't work because in the Stack Overflow post the date is still numeric, whereas my x variable is categorical.

library(ggplot2)
library(ggridges)
library(magrittr)  # provides the %>% pipe used below

df <- data.frame(system = c("AbiC", "AbiC", "AbiD", "AbiD", "AbiD", "AbiD"),
                 proportion = c(0.520547945, 1.018220793, 11.8630137, 5.320813772, 14.46945338, 9.1367713),
                 dataset = c("d1", "d2", "d1", "d3", "d2", "d4"))

# as.integer() truncates towards zero, so 0.52 becomes 0
df$proportion <- as.integer(df$proportion)

# repeat each row `proportion` times
df1 <- df %>%
  tidyr::uncount(proportion)

plot_ridge <- ggplot(df1, aes(x = system, y = dataset)) +
  geom_density_ridges(alpha = 0.5, scale = 0.5, color = "grey90") +
  theme_ridges()
plot_ridge


I'd be really grateful if anyone could help me!

R ggridges ggplot2

It would be easier to help if you drew what you want; as described, it isn't very clear.

Comment by benformatics 4.1k, 14 months ago

score 0 · Answer 1 · 2023-10-25

library(ggplot2)
library(magrittr)
library(ggridges)

df <- data.frame(system=c("AbiC","AbiC","AbiD","AbiD","AbiD","AbiD"), 
                 proportion=c(0.520547945, 1.018220793,11.8630137,5.320813772, 14.46945338, 9.1367713),
                 dataset=c("d1","d2", "d1", "d3", "d2", "d4"))

## as.integer() truncates, so 0.52 would become 0; multiply by 10 first
df$proportion <- as.integer(10 * df$proportion)

df1 <- df %>%
  tidyr::uncount(proportion)

ggplot(df1, aes(x = dataset, y = after_stat(density), fill = system)) +
  geom_density(position = "stack")  ## + facet_grid(~dataset)
ggplot(df1, aes(x = dataset, y = after_stat(count), fill = system)) +
  geom_density(position = "fill")

## the discrete labels don't look great, so make the factor numeric manually
df1$num <- as.numeric(gsub("d", "", df1$dataset))
x_labels <- sort(unique(df1$dataset))

## plot 1
ggplot(df1, aes(x = num, y = after_stat(density), fill = system)) +
  geom_density(position = "stack") +
  scale_x_continuous(breaks = seq_along(x_labels), labels = x_labels) +
  ggtitle("Stacked")
## plot 2
ggplot(df1, aes(x = num, y = after_stat(density), fill = system)) +
  geom_density(position = "fill") +
  scale_x_continuous(breaks = seq_along(x_labels), labels = x_labels) +
  ggtitle("Filled")

Is this what you are after? See the final two plots, where the density is computed separately for each Abi system.

[Image: density plot with numeric x]

Or maybe this, which uses the transformed raw counts instead of the density:

ggplot(df1, aes(x = num, y = after_stat(count), fill = system)) +
  geom_density(position = "fill")
ggplot(df1, aes(x = num, y = after_stat(count), fill = system)) +
  geom_density(position = "stack")
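To see why the proportions are multiplied by 10 before `as.integer()` in the code above, here is a small base-R sketch. It uses the same values as the question; the column name `n` and the `rep()`-based expansion are only my stand-in for roughly what `tidyr::uncount()` does, not its actual implementation.

```r
# Proportions from the question's data frame
proportion <- c(0.520547945, 1.018220793, 11.8630137,
                5.320813772, 14.46945338, 9.1367713)

# as.integer() truncates towards zero, so the first value is lost entirely
truncated <- as.integer(proportion)
print(truncated)

# Scaling by 10 first keeps one decimal place of resolution
scaled <- as.integer(10 * proportion)
print(scaled)

df <- data.frame(
  system  = c("AbiC", "AbiC", "AbiD", "AbiD", "AbiD", "AbiD"),
  dataset = c("d1", "d2", "d1", "d3", "d2", "d4"),
  n       = scaled  # hypothetical weight column, one count per 0.1 of proportion
)

# Rough base-R equivalent of tidyr::uncount(df, n): repeat each row n times
expanded <- df[rep(seq_len(nrow(df)), times = df$n), c("system", "dataset")]

# Rows per system/dataset pair now track the scaled proportions
print(table(expanded$system, expanded$dataset))
```

Without the scaling, the AbiC/d1 row would get a weight of 0 and drop out of the expanded data completely, so it could never show up in a density plot.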

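If a ggridges-style plot is still the goal, another option might be to skip the density estimation and pass the proportions directly as ridge heights with `geom_ridgeline()`. This is only a sketch under some assumptions of mine: a system/dataset pair that is missing from the data is treated as a proportion of 0, the `pos` column is a made-up numeric position for each system, and `scale = 0.05` is an arbitrary value chosen so the ridges don't overlap too much.

```r
library(ggplot2)
library(ggridges)

df <- data.frame(
  system     = c("AbiC", "AbiC", "AbiD", "AbiD", "AbiD", "AbiD"),
  proportion = c(0.520547945, 1.018220793, 11.8630137,
                 5.320813772, 14.46945338, 9.1367713),
  dataset    = c("d1", "d2", "d1", "d3", "d2", "d4")
)

# Assumption: a missing system/dataset pair means proportion 0,
# so build every combination and fill the gaps with zeros
grid <- expand.grid(system = unique(df$system),
                    dataset = unique(df$dataset),
                    stringsAsFactors = FALSE)
full <- merge(grid, df, all.x = TRUE)
full$proportion[is.na(full$proportion)] <- 0

# Hypothetical numeric x position for each system (1, 2, ...)
system_levels <- sort(unique(full$system))
full$pos <- match(full$system, system_levels)

# One ridge per dataset; its height at each system is the proportion
p <- ggplot(full, aes(x = pos, y = dataset, height = proportion)) +
  geom_ridgeline(scale = 0.05, alpha = 0.5, colour = "grey40") +
  scale_x_continuous(breaks = seq_along(system_levels),
                     labels = system_levels) +
  labs(x = "system", y = "dataset") +
  theme_ridges()
p
```

With only two systems each ridge is just a straight line between two points, so this will only start to look like a ridge plot once the full list of systems is on the x-axis.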