Hello everyone,
I have some basic questions regarding the differential expression results obtained from nuclear counts and unspliced counts in single-nucleus data.
For instance, certain groups of genes appear upregulated when DEGs are computed from nuclear counts, while the same genes appear downregulated when unspliced counts are used. I am curious about the biological meaning behind this discrepancy.
From my understanding, nuclear counts primarily capture the transcriptional dynamics, whereas unspliced counts represent changes in pre-mRNA levels.
I would greatly appreciate your advice.
Thank you, Akila
I am not familiar with single-nucleus experiments, but I can try from a general biological standpoint. As I understand it, nuclear counts simply means RNA found in the nucleus, while unspliced counts would be pre-mRNAs, as mentioned. I also assume that the DEG computation is not time-resolved, so you cannot distinguish different phases of the RNA life cycle.
In general, I would think that an increase in nuclear mRNA together with a decrease in unspliced counts could indicate defective export of the mRNA. The decrease in unspliced counts would suggest the transcripts do get processed, or at least spliced, while the accumulation of spliced mRNA in the nucleus may account for the increased nuclear counts.
In our own bulk data, we have observed an increase in steady-state RNA with a decrease in nascent RNA for some genes, which I think is equivalent to what you are observing. We believe this is due to defects in transcript degradation, such that even though the gene is down-regulated, the transcripts accumulate, but I don't know if this would make sense in the nucleus.
However, until I saw data suggesting otherwise, I would assume those genes are down-regulated while their transcripts accumulate for some reason.
Alternatively, mRNA processing for these transcripts could be increased, such that the pre-mRNA exists for a shorter period. But this explanation doesn't seem likely to me.
I am also thinking RNA velocity (here too I am largely unfamiliar) may offer some insight into what these dynamics could be. There, considering cell-to-cell dynamics, high intronic counts precede the accumulation of mature RNA. So, if cell 1 is in an earlier state, then for a gene that is being upregulated you would see higher intron counts relative to mature counts. If cell 2 is in a later state, where gene induction has decreased or repression has started, then intron counts would make up a lower proportion as the mature RNA becomes dominant. This seems similar to ATpoint's answer.
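To make the early-versus-late intuition concrete, here is a toy simulation. It is only a sketch: it assumes the simple two-species model commonly used in RNA velocity (unspliced RNA made at a rate alpha, spliced at rate beta, spliced RNA removed at rate gamma), and every rate, time point, and function name below is a made-up illustration, not something estimated from real data.

```python
def simulate(alpha_of_t, beta=1.0, gamma=0.5, dt=0.01, t_end=20.0):
    """Euler integration of a toy model (all rates are illustrative):
         dU/dt = alpha(t) - beta * U      (unspliced / intronic)
         dS/dt = beta * U  - gamma * S    (spliced / mature)
    Returns a list of (time, unspliced, spliced) tuples."""
    unspliced, spliced = 0.0, 0.0
    trajectory = []
    for step in range(int(round(t_end / dt))):
        t = step * dt
        d_unspliced = alpha_of_t(t) - beta * unspliced
        d_spliced = beta * unspliced - gamma * spliced
        unspliced += d_unspliced * dt
        spliced += d_spliced * dt
        trajectory.append((t + dt, unspliced, spliced))
    return trajectory


def induction_then_repression(t, on_until=10.0, rate=10.0):
    """Hypothetical transcription rate: gene on until t = 10, then off."""
    return rate if t < on_until else 0.0


def unspliced_fraction_at(trajectory, time):
    """Unspliced / (unspliced + spliced) at the sampled point nearest `time`."""
    _, unspliced, spliced = min(trajectory, key=lambda row: abs(row[0] - time))
    return unspliced / (unspliced + spliced)


trajectory = simulate(induction_then_repression)
early = unspliced_fraction_at(trajectory, 1.0)   # "cell 1": shortly after induction
late = unspliced_fraction_at(trajectory, 12.0)   # "cell 2": shortly after repression
print(f"unspliced fraction early: {early:.2f}, late: {late:.2f}")
```

With these made-up rates, the early cell is dominated by unspliced RNA and the late cell by spliced RNA, which is the ordering described above. Real data will be noisier and the rates differ per gene.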
Let's get some terminology straight: Let's say we have "nascent" RNA species (N) which are unspliced and "mature" species (M) which are spliced.
Nascent species (N) are produced by transcription. Nascent RNAs are converted (i.e. spliced) into mature RNAs (M). Mature species then get degraded. You need to consider the kinetics of these processes, especially since the kinetics of RNA production can vary between cell types.
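As a rough illustration of why the kinetics matter, here is a steady-state sketch of that N -> M -> degraded scheme. It assumes constant first-order rates (k for transcription, beta for splicing, gamma for removal of M), and for nuclei I am lumping export and nuclear decay into gamma; that lumping and all the numbers are my assumptions, chosen only to show that the opposite-direction pattern is possible.

```python
def steady_state(k, beta, gamma):
    """Steady state of the toy model
         dN/dt = k - beta * N
         dM/dt = beta * N - gamma * M
    Setting both derivatives to zero gives N = k / beta and M = k / gamma."""
    nascent = k / beta
    mature = k / gamma
    return nascent, mature


# Hypothetical rates, for illustration only.
control = steady_state(k=10.0, beta=1.0, gamma=1.0)
# Transcription halved, but removal of mature RNA slowed five-fold.
perturbed = steady_state(k=5.0, beta=1.0, gamma=0.2)

for label, (nascent, mature) in (("control", control), ("perturbed", perturbed)):
    print(f"{label}: unspliced={nascent:.1f}, total={nascent + mature:.1f}")
```

In this toy case the unspliced level drops while the total (unspliced plus spliced) rises, so a DE test on unspliced counts and one on total nuclear counts would point in opposite directions even though the only change in production is a decrease.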
The detailed biophysical interpretations are sort of outside my field, but there is a lot of interesting work being done on this by my colleagues: https://www.nature.com/articles/s41467-022-34857-7 (See his other works too)