How to reproduce a stacked bar chart in R
4.1 years ago

Hi,

I have used the analytical tool CIBERSORTx to impute gene expression profiles and estimate the abundances of member cell types in a mixed cell population, using RNA-seq and TCGA data. The output table has TCGA barcodes in rows and cell types in columns. I would like to generate a stacked bar chart in R, like this one:

enter image description here

Any idea?

Thanks!

rna-seq R • 7.0k views
ADD COMMENT

Can you repost the image? It's not showing up.

ADD REPLY

Reposted the image, sorry!

ADD REPLY
4.1 years ago

Here, data is a data.frame that has the TCGA barcodes in a column called "barcodes", the relative percentages in a column called "percent", and the cell types in a column called "cell_types".

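# Install the tidyverse (which includes ggplot2) if it is not already available
if (!require("tidyverse")) install.packages("tidyverse")
library(ggplot2)

# One bar per barcode, stacked by cell type
ggplot(data, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
    geom_bar(position = "stack", stat = "identity")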
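
If the values in your table are raw scores rather than relative percentages, one option is to rescale them within each barcode before plotting. The following is only a minimal sketch under that assumption: the column name score and all numbers are made up for illustration, and dplyr is assumed to be installed.

```r
library(dplyr)
library(ggplot2)

# Toy long-format data with raw scores that do not sum to 1 per barcode
# (hypothetical values, for illustration only).
data <- data.frame(
  barcodes   = rep(c("sample_1", "sample_2"), each = 3),
  cell_types = rep(c("cell_type1", "cell_type2", "cell_type3"), times = 2),
  score      = c(2, 3, 5, 6, 1, 3)
)

# Rescale within each barcode so that every bar has a total height of 1.
data <- data %>%
  group_by(barcodes) %>%
  mutate(percent = score / sum(score)) %>%
  ungroup()

# Stacked bars with the y axis labelled as percentages.
p <- ggplot(data, aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_col(position = "stack") +
  scale_y_continuous(labels = scales::percent)
```

Calling print(p) draws the chart. A shorter alternative is to keep the raw values and use position = "fill", which lets ggplot2 do the rescaling at plot time.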

EDIT: In case you want to change the colors of the cell types, you can add scale_fill_manual, with labels being the cell-type names shown in the legend and values being the color hex codes you want. The "#000000" values below are placeholders; replace them with distinct colors, since identical values would draw every segment in black:

ggplot(data, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
    geom_bar(position = "stack", stat = "identity") +
    scale_fill_manual(labels = c("cell_type1", "cell_type2", "cell_type3"),
                      values = c("#000000", "#000000", "#000000"))
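
As a hedged variation on the snippet above: passing a named vector to values ties each color to a cell type by name instead of by position, which is harder to get wrong when there are many cell types. The data and the hex codes here are arbitrary, illustrative choices.

```r
library(ggplot2)

# Toy long-format data; the values are invented for illustration only.
data <- data.frame(
  barcodes   = rep(c("sample_1", "sample_2"), each = 3),
  cell_types = rep(c("cell_type1", "cell_type2", "cell_type3"), times = 2),
  percent    = c(0.2, 0.3, 0.5, 0.6, 0.1, 0.3)
)

# Named vector: each name must match a value in the cell_types column.
# The hex codes are arbitrary; pick any distinct colors you like.
cell_colors <- c(
  cell_type1 = "#1B9E77",
  cell_type2 = "#D95F02",
  cell_type3 = "#7570B3"
)

# Colors are matched to cell types by name, not by order.
p <- ggplot(data, aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_col(position = "stack") +
  scale_fill_manual(values = cell_colors)
```

Calling print(p) draws the chart; geom_col() is used here as shorthand for geom_bar(stat = "identity").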
ADD COMMENT

Thank you!

My problem is that my table is formatted like this one, so I don't know how to reshape it to get the columns you described:

enter image description here

ADD REPLY
library(ggplot2)
library(dplyr)  # provides %>% and rename()
library(tidyr)  # provides pivot_longer()
set.seed(1)
test_df <- data.frame(mixture = c("TCGA.P7.ASNX", "TCGA.P7.A5NY", "TCGA.P8.A5KD",
                       "TCGA.P7.ASNN", "TCGA.P7.A5NB", "TCGA.P8.A5KE"),
           cell_type_a = runif(6),
           cell_type_b = runif(6),
           cell_type_c = runif(6))

#mixture cell_type_a cell_type_b cell_type_c
#1 TCGA.P7.ASNX   0.2655087  0.94467527   0.6870228
#2 TCGA.P7.A5NY   0.3721239  0.66079779   0.3841037
#3 TCGA.P8.A5KD   0.5728534  0.62911404   0.7698414
#4 TCGA.P7.ASNN   0.9082078  0.06178627   0.4976992
#5 TCGA.P7.A5NB   0.2016819  0.20597457   0.7176185
#6 TCGA.P8.A5KE   0.8983897  0.17655675   0.9919061


# Reshape from wide (one column per cell type) to long (one row per barcode and cell type)
test_df <- test_df %>%
  pivot_longer(!mixture, names_to = "cell_types", values_to = "percent") %>%
  rename(barcodes = mixture)

#barcodes     cell_types  percent
 #  <chr>        <chr>         <dbl>
 #1 TCGA.P7.ASNX cell_type_a  0.266 
 #2 TCGA.P7.ASNX cell_type_b  0.945 
 #3 TCGA.P7.ASNX cell_type_c  0.687 
 #4 TCGA.P7.A5NY cell_type_a  0.372 
 #5 TCGA.P7.A5NY cell_type_b  0.661 
 #6 TCGA.P7.A5NY cell_type_c  0.384 
 #7 TCGA.P8.A5KD cell_type_a  0.573 
 #8 TCGA.P8.A5KD cell_type_b  0.629 
 #9 TCGA.P8.A5KD cell_type_c  0.770 
#10 TCGA.P7.ASNN cell_type_a  0.908 
#11 TCGA.P7.ASNN cell_type_b  0.0618
#12 TCGA.P7.ASNN cell_type_c  0.498 
#13 TCGA.P7.A5NB cell_type_a  0.202 
#14 TCGA.P7.A5NB cell_type_b  0.206 
#15 TCGA.P7.A5NB cell_type_c  0.718 
#16 TCGA.P8.A5KE cell_type_a  0.898 
#17 TCGA.P8.A5KE cell_type_b  0.177 
#18 TCGA.P8.A5KE cell_type_c  0.992 

ggplot(data = test_df, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_bar(position = "stack", stat = "identity") +
  # Placeholder colors: replace with distinct hex codes, one per cell type
  scale_fill_manual(labels = c("cell_type_a", "cell_type_b", "cell_type_c"),
                    values = c("#000000", "#000000", "#000000"))
ADD REPLY

I will try it, thanks!

It will be challenging, because I have to include 178 rows (barcodes) and 28 columns (cell types). Is there a simple way to code it?

ADD REPLY

What's challenging about it?

ADD REPLY

It is challenging with my limited coding experience, sorry!

I meant that I don't know how to include all the barcodes and cell types other than by typing them in manually.

ADD REPLY

Ah, you don't have to! The part of the code where I type the cell types and colors (scale_fill_manual) is optional. You can simply use the code above to transform your data.frame into the correct format, and then call ggplot without scale_fill_manual:

ggplot(data = test_df, mapping = aes(x = barcodes, y = percent, fill = cell_types )) + 
  geom_bar(position = "stack", stat = "identity")

It'll automatically add colors.
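
To illustrate that nothing has to be typed by hand, here is a rough sketch on a simulated table with 178 samples and 28 cell types. Everything in it is an assumption made for illustration: the sample and cell-type names are invented, and the extra statistics columns listed in stat_cols ("P-value", "Correlation", "RMSE") are a guess at what a CIBERSORTx results table may contain next to the cell-type fractions, so check the column names of your own file.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(42)
n_samples    <- 178
n_cell_types <- 28

# Simulated stand-in for a results table: one row per sample,
# one column per cell type, fractions summing to 1 per row.
fractions <- matrix(runif(n_samples * n_cell_types), nrow = n_samples)
fractions <- fractions / rowSums(fractions)
colnames(fractions) <- paste0("cell_type_", seq_len(n_cell_types))

results <- data.frame(
  Mixture = sprintf("sample_%03d", seq_len(n_samples)),
  fractions,
  RMSE = runif(n_samples),  # example of a non-cell-type column
  check.names = FALSE
)

# Assumed names of columns that are not cell types; adjust to your file.
stat_cols <- c("P-value", "Correlation", "RMSE")

# Drop the statistics columns if present, then reshape wide -> long.
long_df <- results %>%
  select(-any_of(stat_cols)) %>%
  pivot_longer(!Mixture, names_to = "cell_types", values_to = "percent") %>%
  rename(barcodes = Mixture)

# All 178 bars and 28 fill colors come from the data, not from typed lists.
p <- ggplot(long_df, aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_col(position = "stack", width = 1) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 4))
```

With a real file, the simulated block would be replaced by something like read.delim("your_results.txt", check.names = FALSE), where the file name is hypothetical. The rotated, small axis labels are just one way to keep 178 barcodes legible.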

ADD REPLY
