Question

Are tens of DEGs still biologically meaningful?

9 weeks ago

FedeXandeR ▴ 20

In general, when a differential expression analysis of a bulk RNA-Seq dataset returns only a small number of differentially expressed genes (say, more than 10 but fewer than 100), bioinformaticians tend to be skeptical about the reliability of the DEG list and/or its biological or functional significance, mostly treating the genes as false positives or accidental dysregulations. My background is that of a mathematical physicist working in biology, so my knowledge of genomics is weak. Hence my question: is there a deep reason in cell biology/physiology why a transcriptional dysregulation of only a few genes should be viewed a priori with suspicion, even when one is quite confident in the quality of the experimental protocol and the execution of the sequencing?

Thank you in advance for your opinions!

DEG differential-expression-analysis • 593 views

ADD COMMENT • link updated 9 weeks ago by DGTool ▴ 290 • written 9 weeks ago by FedeXandeR ▴ 20

score 4 · Answer 1 · 2024-09-27

9 weeks ago

tothepoint ▴ 940

Assume that the network diagram below represents multiple pathways governing important biological processes; it illustrates the complex interactions among genes and the potential impact of gene expression changes.

[Image: network diagram with numbered nodes; not available]

Even a small number of differentially expressed genes (DEGs) identified in biological data analysis can be biologically significant and may require further investigation. It is crucial to consider the biological context, the roles of the genes in their respective pathways, and their network connectivity to fully understand the implications of their differential expression.

For example, suppose one of the DEGs sits in a very specific biological pathway, such as the gene represented by node "1" in the diagram. This gene might start a cascade of signals that impacts multiple cellular processes, as seen in the connections leading to other nodes like "3", "4", and "5". Even though the direct impact might appear confined to a few nodes, the overall effects on the biological system could be substantial.

Conversely, a node like "6" may be connected to fewer pathways or other nodes, but the effects on those specific pathways could be critically important. This demonstrates that even minimal changes in gene expression can have profound and wide-reaching effects at the cellular or even the organism level, which is why RNA-seq and other genomic data call for a nuanced interpretation.

By using this network model, we can visually understand how interconnected and dependent biological processes are, highlighting the importance of each gene in maintaining the normal function and health of biological systems.
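As a rough illustration of the cascade idea, the sketch below counts how many nodes lie downstream of a given gene in a small directed graph. The edge list is a hypothetical toy network invented for this example; it is not read from the diagram, and the function name `downstream` is likewise just an illustrative choice.

```python
from collections import deque

# Hypothetical toy network: these edges are invented for illustration
# and are NOT taken from the diagram in the answer.
TOY_NETWORK = {
    "1": ["3", "4", "5"],
    "3": ["7"],
    "4": ["7", "8"],
    "5": [],
    "6": ["9"],
    "7": [],
    "8": [],
    "9": [],
}


def downstream(network, start):
    """Return the set of nodes reachable from `start`, excluding `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in network.get(node, []):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    for gene in ("1", "6"):
        targets = downstream(TOY_NETWORK, gene)
        print(f"node {gene}: {len(targets)} downstream node(s): {sorted(targets)}")
```

In this toy graph a single change at node "1" reaches several other nodes, while node "6" reaches only one. Reach alone says nothing about how important each affected pathway is, which is the point made about node "6" above.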

I hope this answers your question.

ADD COMMENT • link 9 weeks ago by tothepoint ▴ 940

Thank you for your kind reply. I understand that even a few genes (or even one) could, in principle, induce dramatic phenotypic changes. However, my concern was slightly different, so let me rephrase it: is this a likely experimental scenario? In other words, after a treatment able to alter cell transcription, how often do you expect to observe just 10 to 100 dysregulated genes? Is it quite common, in your experience, or is it the exception? I would say that it heavily depends on the experiment, BUT a lot of bioinformaticians I have talked with think that it’s generally unlikely. Their main argument is that everything happens within a deeply integrated transcriptional network: when you move one gene, you very likely also alter the expression of many others downstream, because everything is connected, gene networks are pervasive, and so on… So when a bulk RNA-Seq study yields something on the order of tens of genes, they think it’s likely that you’re missing something, and they start suspecting that the study is underpowered, either from the technical or the theoretical point of view. In this sense they don’t think that, e.g., 50 DEGs could be biologically meaningful, and they often conclude with something like “no relevant transcriptional effects could be observed”.

What do you think about it?
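The "underpowered study" suspicion can be made a bit more concrete with a back-of-the-envelope calculation. The sketch below is only a crude approximation: it treats each gene as a two-sided two-sample z-test on log-scale expression with a known standard deviation, and it uses a fixed per-gene `alpha` as a stand-in for multiple-testing correction. Real differential expression tools use count models and are not reproduced here. All parameter values and function names are hypothetical.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha):
    """Approximate power of a two-sided two-sample z-test.

    effect: true difference in mean log expression between conditions
    sd: per-sample standard deviation, assumed known and equal in both groups
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_group) ** 0.5  # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / se
    # Probability that the test statistic lands in either rejection tail.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def expected_detected(n_true, effect, sd, n_per_group, alpha):
    """Expected number of truly changed genes detected, all with the same effect."""
    return n_true * approx_power(effect, sd, n_per_group, alpha)


if __name__ == "__main__":
    # Hypothetical scenario: 1000 genes truly shifted by 1 log2 unit, sd of 0.7.
    for n in (2, 3, 5, 10):
        hits = expected_detected(1000, effect=1.0, sd=0.7, n_per_group=n, alpha=1e-4)
        print(f"{n} replicates per group: about {hits:.0f} of 1000 detected")
```

Under these assumptions, the same underlying biology yields very different DEG counts depending on replicate number, so a short list can reflect either a genuinely narrow response or limited power. The calculation cannot tell which of the two applies to a given experiment.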

ADD REPLY • link 9 weeks ago by FedeXandeR ▴ 20

Personally, I feel this depends not only on the experiment (which is a large factor), but also on the thresholds and filtering you apply and on the statistical methods you use. If you only want to look at highly confident genes (i.e. very low FDR-adjusted p-values) or ones with a large change in expression, then I don't think it would be unheard of to get only a few differentially expressed genes (according to the set thresholds). If there are also enough replicates in each condition, from which noise and wanted/unwanted variation can be estimated, that might also shrink the number of candidates (but with higher confidence). So overall, it really depends on what one is looking for, and there is unlikely to be a one-size-fits-all answer on what number of DEGs (e.g. for bulk RNA-seq) is likely or unlikely.

(Not saying there aren't times when such results might raise an eyebrow, but if the experimental setup and statistical analyses look sound, then sometimes it can just be like that; investigating why/how these unexpected results came to be can also be fun.)
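To show how much the DEG count depends on the chosen cutoffs, here is a minimal sketch that applies a Benjamini-Hochberg adjustment to a list of p-values and then counts genes passing an FDR threshold and a minimum absolute log2 fold change. The p-values and fold changes are synthetic placeholders, and the function names `benjamini_hochberg` and `count_degs` are illustrative rather than taken from any particular package.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_degs(pvalues, log2fc, alpha, min_abs_lfc):
    """Count genes with adjusted p <= alpha and |log2FC| >= min_abs_lfc."""
    padj = benjamini_hochberg(pvalues)
    return sum(
        1 for q, lfc in zip(padj, log2fc)
        if q <= alpha and abs(lfc) >= min_abs_lfc
    )


if __name__ == "__main__":
    # Synthetic placeholder values, not results from any real experiment.
    pvals = [0.0001, 0.0004, 0.002, 0.01, 0.03, 0.04, 0.2, 0.5]
    lfcs = [2.1, -1.8, 0.6, 1.2, -0.4, 0.9, 0.3, -0.1]
    for alpha in (0.1, 0.05, 0.01):
        for min_lfc in (0.0, 1.0):
            n = count_degs(pvals, lfcs, alpha, min_lfc)
            print(f"alpha={alpha}, |log2FC|>={min_lfc}: {n} gene(s)")
```

The same table of test results can yield quite different DEG counts under different, equally defensible thresholds, so the raw number is hard to interpret without knowing the cutoffs behind it.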

ADD REPLY • link 9 weeks ago by DGTool ▴ 290