My model explores the dynamic expression of genes during regeneration. I performed single-cell sequencing at 12 time points and annotated the cells. Some rare cell types were missing at some time points.
As shown in the figures, for each cell type I plot the expression of a single gene across the time points from left to right. The violin plots (VlnPlot() function) show the typical expression level per cell, and the DotPlot (made with ggplot2) shows the percentage of expressing cells and the expression Z-score. The violin plots and DotPlots show essentially the same dynamic pattern for a gene.
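To make the two DotPlot quantities concrete, here is a minimal sketch of how they could be computed for one gene. It is written in plain Python only so that it runs without extra packages. The function names, the `min_expr` cutoff, and the choice to standardize the group means with the sample standard deviation are all assumptions for illustration, not necessarily what my plotting code does.

```python
from statistics import mean, stdev
from typing import List, Optional


def percent_expressing(values: List[float], min_expr: float = 0.0) -> Optional[float]:
    """Fraction of cells in one group whose expression exceeds min_expr.

    Returns None when the group has no cells (e.g. a rare cell type
    that is missing at a given time point).
    """
    if not values:
        return None
    return sum(v > min_expr for v in values) / len(values)


def zscores(group_means: List[float]) -> List[float]:
    """Standardize the mean expression of one gene across groups.

    Assumption: the Z-score is (mean - overall mean) / sample SD,
    computed across the groups shown in the plot.
    """
    if len(group_means) < 2:
        return [0.0 for _ in group_means]
    mu = mean(group_means)
    sd = stdev(group_means)
    if sd == 0:
        return [0.0 for _ in group_means]
    return [(m - mu) / sd for m in group_means]


if __name__ == "__main__":
    # Hypothetical expression values for one gene in one cell type at one time point.
    cells = [0.0, 0.0, 1.2, 3.0]
    print(percent_expressing(cells))
    # Hypothetical mean expression of the same gene in three groups.
    print(zscores([1.0, 2.0, 3.0]))
```

Returning None for an empty group keeps missing cell types distinguishable from groups that are present but do not express the gene.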
Figure 1 (gene 1):
Figure 2 (gene 2):
These two figures show the two gene expression patterns I am most interested in. Rows 1-4 of each plot are cell types from one cell family, which I will call Family A; rows 5-8 are Family B. For now, I do not care how a gene is dynamically expressed between cell types within a family. In Figure 1, as regeneration proceeds from left to right, gene 1 is at first expressed only in Family A and then spreads to both families. Figure 2 shows the opposite: expression of gene 2 starts in Family B and then spreads to both families. How can I screen the entire genome (tens of thousands of genes), gene by gene, for these two patterns in which expression gradually spreads from one family to both?
Moreover, the cell types that temporarily "do not express" a gene do not actually have zero expression; they just have a very low fraction of expressing cells (expression range) or a very low expression level. This makes the screening harder. It is easy to tell by eye whether a gene is "actively expressed", but turning that judgment into a program is too complicated for someone like me, who has a biology background and can only use basic Linux and R. My data also look very noisy, so I have no idea how to automate the gene screening. I know there are published tools for detecting time-dynamic DEGs in single-cell data, such as TDEseq and CASi, but they cannot find the genes I need.
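To pin down the pattern I am describing, here is a rough sketch of the rule I apply by eye, written in plain Python only for illustration. For each gene it takes the fraction of expressing cells in Family A and in Family B at every time point, calls each family "on", "off", or ambiguous, and checks whether the gene goes from one family only to both families. The thresholds `PCT_ON` and `PCT_OFF`, the `min_run` parameter, and the function names are hypothetical and would need tuning. It is a sketch under these assumptions, not a validated method.

```python
from typing import List, Optional

# Hypothetical thresholds on the fraction of expressing cells in a family.
PCT_ON = 0.25   # at or above this, the family counts as "on"
PCT_OFF = 0.10  # below this, the family counts as "off"


def call_state(pct: float) -> Optional[bool]:
    """Return True (on), False (off), or None (ambiguous, between thresholds)."""
    if pct >= PCT_ON:
        return True
    if pct < PCT_OFF:
        return False
    return None


def classify_spread(pct_a: List[Optional[float]],
                    pct_b: List[Optional[float]],
                    min_run: int = 2) -> str:
    """Classify one gene as "A_to_both", "B_to_both", or "other".

    pct_a and pct_b hold one value per time point, in time order.
    None marks a time point at which the family has no cells.
    """
    names = {(True, False): "A", (False, True): "B",
             (True, True): "AB", (False, False): "off"}
    labels = []
    for a, b in zip(pct_a, pct_b):
        if a is None or b is None:
            continue  # family missing at this time point
        state = (call_state(a), call_state(b))
        if None in state:
            continue  # skip ambiguous time points instead of guessing
        labels.append(names[state])

    # Run-length encode the labels, e.g. A A A AB AB -> [A x3, AB x2].
    runs = []
    for lab in labels:
        if runs and runs[-1][0] == lab:
            runs[-1][1] += 1
        else:
            runs.append([lab, 1])

    # Strict rule: exactly one single-family phase followed by one "both" phase.
    if len(runs) != 2 or runs[1][0] != "AB":
        return "other"
    if min(runs[0][1], runs[1][1]) < min_run:
        return "other"
    if runs[0][0] == "A":
        return "A_to_both"
    if runs[0][0] == "B":
        return "B_to_both"
    return "other"


if __name__ == "__main__":
    # Hypothetical fractions for one gene over six time points.
    fam_a = [0.60, 0.70, 0.65, 0.70, 0.80, 0.75]
    fam_b = [0.02, 0.05, 0.03, 0.15, 0.40, 0.50]
    print(classify_spread(fam_a, fam_b))
```

Using two thresholds with an ambiguous zone in between is meant to absorb the low, noisy expression described above, and requiring at least `min_run` clear time points per phase guards against a single noisy time point producing a hit. The rule is deliberately strict: a gene that is off everywhere at the start, or that switches back and forth, is reported as "other".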
Many thanks!