Gene expression heatmap following pseudotime analysis
1
0
Entering edit mode
3.7 years ago
Vitis ★ 2.6k

Going through some of the instructions and tutorials for single cell mRNA-Seq analysis, I was looking for some solid and good looking gene expression heatmap plotting following pseudotime, where the gradual shift of transient expression could be clearly shown. Since monocle3 retired the plot_pseudotime_heatmap function, I haven't found any good examples of such function implemented anywhere. Maybe my sources are limited, here are some of the tutorials I've gone through:

https://broadinstitute.github.io/2019_scWorkshop/functional-pseudotime-analysis.html#plots-of-gene-expression-over-time

https://scrnaseq-course.cog.sanger.ac.uk/website/biological-analysis.html#expression-of-genes-through-time

https://satijalab.org/seurat/articles/get_started.html

https://cole-trapnell-lab.github.io/monocle3/docs/starting/

Do we only have the option to draw such heatmaps by custom code now? In R I could successfully obtain the feature (gene) expression matrix by pseudotime, but I had a hard time to order the features in a good way to show the desired effect of shifting expression through pseudotime, other than the hcluster process applied to the features. If anyone have any good suggestion on this, it would be great!

Here is an example of a similar heatmap. But this is from ArchR and single cell ATAC data. Is there something similar in single cell RNA analysis?

https://www.archrproject.com/bookdown/images/HemeWalkthrough/PNG/Plot-LymphoidU-Traj-Heatmaps_4.png

RNA-Seq • 5.6k views
ADD COMMENT
0
Entering edit mode

Can you provide an example of what you want the heatmap to look like?

ADD REPLY
0
Entering edit mode

I updated my original question with an example plot.

ADD REPLY
0
Entering edit mode

I also want a similar thing. Did you find any solution? Also could you please tell me how you obtained the feature (gene) expression matrix by pseudotime? I would really appreciate it if you share command(s) with me. Thanks in advance.

ADD REPLY
1
Entering edit mode

I've worked with a combination of R package Seurat and Monocle3. Following the instruction here (https://satijalab.org/signac/articles/monocle.html), we could obtain the trajectory inferred from Monocle3. With the new metadata column assigned with pseudotime, essentially every cell now has an additional identifier in the form of pseudotime values. We could access this additional metadata pseudotime by seurat.obj$traj.name.

Then we could obtain the expression (scaled from Seurat) matrix with "GetAssayData" function, keeping the column names (cell IDs), then replace the cell IDs with the pseudotime values. At this point, we have a expression matrix with features in rows and cells in columns (but in the form of pseudotime values), for which we could order the columns based on pseudotime. Now it's ready to plot the desired heatmap with features in rows and cells ordered in pseudotime in columns. I'll leave the detailed R code, as this would be a good practice to know the structure of a seurat and monocle cds object.

But the sticky point is how you order the features in rows. Looks like there is no good way other than simple hierarchical clustering for a better illustration of transient expression following pseudotime.

ADD REPLY
1
Entering edit mode
3.7 years ago

Generating some example data.

library("tidyverse")

dat <- map(
  set_names(seq_len(50), sprintf("ENSG%06d", seq_len(50))),
  ~rbinom(sample(seq(20, 100), 1), sample(seq(10, 100), 1), 0.5)
)

dat <- dat %>%
  imap(function(x, y) {table(x) %>% as_tibble %>% rename_with(.fn=~y, .cols=n)}) %>%
  reduce(full_join, on="x") %>%
  mutate(x=as.numeric(x)) %>%
  complete(x=seq(0, max(.[, 1])) %>%
  mutate(across(starts_with("ENSG"), ~replace_na(.x, 0))) %>%
  arrange(x) %>%
  pivot_longer(!x, names_to="gene", values_to="count") %>%
  pivot_wider(names_from=x, values_from=count) %>%
  column_to_rownames("gene") %>%
  as.matrix

> dat[1:10, 1:10]
           0 1 2 3 4 5 6 7 8 9
ENSG000001 0 0 0 0 1 2 4 2 4 9
ENSG000002 0 0 0 0 0 0 0 0 0 0
ENSG000003 0 0 0 0 0 0 0 0 0 0
ENSG000004 0 0 0 0 0 0 0 0 0 0
ENSG000005 0 0 0 0 0 0 0 0 0 0
ENSG000006 0 0 0 0 0 0 0 0 0 0
ENSG000007 0 0 0 0 0 0 0 0 0 0
ENSG000008 0 0 0 0 0 0 0 0 0 0
ENSG000009 0 0 0 0 0 0 0 0 0 0
ENSG000010 0 0 0 0 0 0 0 2 2 2

I would try ordering by row maxima after applying a sliding window average. I'll use a sliding window of 5 and a step of 1.

library("ComplexHeatmap")
library("zoo")

window <- 5
step <- 1

Heatmap(
   dat[order(apply(t(rollapply(t(dat), width=window, by=step, FUN=mean)), 1, which.max)), ],
   cluster_rows=FALSE, cluster_columns=FALSE,
   show_row_names=FALSE, show_column_names=FALSE
)

enter image description here

ADD COMMENT
0
Entering edit mode

It's a brilliant solution! So it basically smooths the expression across features, then sorts the features based on the smoothed averages?

ADD REPLY
1
Entering edit mode

Yep, the code is essentially calculating a smoothed average, and then sorting the genes based on how long in pseudotime before the maximum average is reached.

ADD REPLY

Login before adding your answer.

Traffic: 1737 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6