I'm using GEO2R to identify DEGs, and I set the significance cut-off at 0.05. After the analysis, the log2 FC threshold was set automatically at around 0.2, and 419 genes were called significant DEGs based on this threshold.
My question is: can we consider genes with such a small fold change to be significant DEGs?
Given that 0.2 is a 1.15 fold change (2^0.2), you're right to be skeptical, but there's no clear answer as to the significance. Consider that this is the minimal fold change of the set of 419, so there are likely genes which change more. However, the real question for significance is whether or not it's reproducible - which you may not be able to assess without doing experiments. I wouldn't pay much attention to a single gene showing this much change, but if it's part of a larger theme of genes, or something you can build a hypothesis around and then test, it may mean something. On the other hand, the sensitivity of the assay may be low for what you're attempting to measure, or the experiment may not evoke much in the way of a transcriptional response. Either way, it's the result you're stuck with. You might treat it as a guide to how much time you want to spend generating hypotheses around your gene set, as in not too much.
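For anyone who wants to see the arithmetic, here is a minimal sketch of the log2-to-linear conversion. The helper name `log2fc_to_fold_change` is just something made up for illustration; it is not part of GEO2R or limma.

```python
def log2fc_to_fold_change(log2fc: float) -> float:
    """Convert a log2 fold change to a linear fold change (a ratio)."""
    return 2.0 ** log2fc


# Compare the threshold from the question (0.2) with two common cut-offs.
for lfc in (0.2, 0.5, 1.0):
    fc = log2fc_to_fold_change(lfc)
    print(f"log2FC = {lfc:.1f} -> fold change = {fc:.2f}")
# log2FC = 0.2 -> fold change = 1.15
# log2FC = 0.5 -> fold change = 1.41
# log2FC = 1.0 -> fold change = 2.00
```

So a log2FC of 0.2 is roughly a 15% change in expression, while a log2FC of 1 is a doubling.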
There is no prescribed logFC that everyone follows. The point is to convince yourself and others - those reviewing your papers and grants - that a particular fold change is meaningful. In my experience, most people will not argue with logFC >= 1, but you may get pushback for smaller numbers.
You will probably hear other opinions from people who look at it primarily from a statistical point of view. They might say that a combination of logFC and p-values could be significant even when logFC <=0.5, but I don't know (m)any biologists who readily subscribe to that idea.
Given that you have 419 candidates at 0.2, I suspect you would still have a healthy number after raising logFC to 1. If some genes with logFC < 1 fit the narrative of other genes with logFC >= 1, you can always bring them into the story while recognizing that their expression change is below the threshold.
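A quick way to check how many genes survive each cut-off is to tally them from the exported results table. This is only a sketch: the logFC values below are made up for illustration (they are not the OP's data), and `count_passing` is a hypothetical helper, not a GEO2R or limma function.

```python
def count_passing(log2fcs, threshold):
    """Count genes whose absolute log2 fold change meets the threshold."""
    return sum(1 for value in log2fcs if abs(value) >= threshold)


# Made-up logFC values standing in for the logFC column of a results table.
example_log2fcs = [0.21, -0.35, 0.48, 0.62, -0.9, 1.3, -1.7, 0.25]

for threshold in (0.2, 0.5, 1.0):
    n = count_passing(example_log2fcs, threshold)
    print(f"|logFC| >= {threshold}: {n} genes")
```

Taking the absolute value keeps down-regulated genes (negative logFC) in the count alongside up-regulated ones.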
Actually, there are only 5 genes with |logFC| > 1 and 264 genes with |logFC| >= 0.5; all the rest fall between 0.2 and 0.5.
However, sir, I'm a beginner, so I'd really appreciate your opinion. Suppose I continue my research with the threshold set at 0.2 and end up with significant results.
Would this threshold make my results doubtful, or would it be acceptable?
Thank you in advance for your help, it's much appreciated.
Since the OP is using GEO2R, which uses limma, the appropriate function is limma::treat() (it has to be run manually, as GEO2R does not support it) rather than DESeq2, which is for counts, not arrays.
Thank you so much for your help, it's much appreciated indeed.