I have two sets of differentially expressed genes (DEGs), each derived by comparing a treatment to a single control baseline, i.e., Treat1 – Ctrl and Treat2 – Ctrl.
My objective is to discuss the genes common to both treatments and the genes differentially expressed in only one treatment. So far I've done this in a very simple way, by taking the intersections and differences of the two DEG lists. It's quite clear that the DEG lists are sensitive to the arbitrary thresholds used (p < 0.05, FC > 1.5), and there are many cases where I call a gene specific to Treat1 even though the fold changes are extremely close, e.g., Treat1 FC = 1.6 and Treat2 FC = 1.4.
I feel like there has to be a better way than this. Does anyone have any suggestions or ideas?
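For anyone following along, the intersection/difference approach described above can be sketched roughly as follows. The gene names and values are made up for illustration, and the sketch only handles up-regulated genes (FC > 1.5); a real analysis would also need to handle down-regulation and would use adjusted p-values from the DE tool.

```python
# Hypothetical per-gene results versus control: gene -> (fold change, p-value).
# All names and numbers here are invented for illustration only.
treat1 = {
    "GeneA": (2.4, 0.001),
    "GeneB": (1.6, 0.010),
    "GeneC": (1.1, 0.400),
    "GeneD": (1.2, 0.300),
}
treat2 = {
    "GeneA": (2.1, 0.002),
    "GeneB": (1.4, 0.030),
    "GeneC": (1.9, 0.004),
    "GeneD": (1.0, 0.900),
}


def deg_set(results, fc_cutoff=1.5, p_cutoff=0.05):
    """Return the genes passing both thresholds (up-regulated only)."""
    return {
        gene
        for gene, (fc, p) in results.items()
        if fc > fc_cutoff and p < p_cutoff
    }


degs1 = deg_set(treat1)
degs2 = deg_set(treat2)

common = degs1 & degs2   # DE in both treatments
only1 = degs1 - degs2    # called "Treat1-specific"
only2 = degs2 - degs1    # called "Treat2-specific"

print("common:", sorted(common))
print("Treat1 only:", sorted(only1))
print("Treat2 only:", sorted(only2))
```

Note that GeneB ends up "Treat1-specific" here even though its fold changes are 1.6 and 1.4, which is exactly the threshold sensitivity being asked about.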
Simply taking or removing the intersecting genes won't be a good idea. The following discussion on computing a statistic for the overlap between two gene lists should help:
Calculating the probability of gene list overlap between an RNA seq and a ChIP-chip data set
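As a rough illustration of what such an overlap statistic can look like: one common choice is a hypergeometric tail probability. This is only a sketch of that idea, not necessarily the method used in the linked discussion, and the function name and arguments are my own. It assumes both lists are drawn from the same universe of tested genes.

```python
from math import comb


def overlap_pvalue(n_universe, n_list1, n_list2, n_overlap):
    """Hypergeometric tail: P(overlap >= n_overlap) by chance.

    Assumes list 2 is a random draw of n_list2 genes from a universe of
    n_universe genes, of which n_list1 belong to list 1.
    """
    max_overlap = min(n_list1, n_list2)
    total = comb(n_universe, n_list2)
    tail = sum(
        comb(n_list1, k) * comb(n_universe - n_list1, n_list2 - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers, purely illustrative: 10 genes tested, 3 DEGs in each list,
# all 3 shared between the two lists.
p = overlap_pvalue(10, 3, 3, 3)
print(p)
```

The result depends heavily on the choice of universe (all annotated genes versus only genes actually tested), so that choice should be stated alongside any p-value.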
I'm not so much trying to show that there is a significant overlap between Treat1 and Treat2. It is clear that there is a very strong overlap; I could supplement this with statistics, but it is well known in the literature that these two treatments are related. My question is more about how to describe the genes that are not shared by both, given that in some cases the difference between them is very small.
You can also run pathway analysis on both comparisons and see whether common pathways are enriched.
This has been done, but really my main objective is to try and identify "common treatment genes" and "treatment specific genes" and then run pathway analysis independently on these sets.
If the aim of the two treatments is the same (e.g., inhibition of proliferation), I would expect most of the genes to be up- or down-regulated similarly. Minor fold change differences (like the one you mentioned) are negligible given the variability introduced by library preparation, data generation, etc. However, we can't advise you on what else could be done without more context on the study design, the disease, and other details. I would also seek help from a core biologist to explain the two conditions.
Well, the treatments activate different immune cells, and the results I find are totally expected and concordant with the literature. What I wanted to do was focus on the "clear" differences between them for the purposes of hypothesis generation. I'm just surprised that there isn't a standard procedure for this type of analysis.
There are naive ways I could do it, for instance requiring the difference in fold changes to be large, e.g., calling a gene specific to one treatment only if its fold change is at least double that in the other treatment. I just didn't want to have to include another arbitrary threshold.
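To make the naive rule concrete, here is a minimal sketch of it. The function name, labels, and the `ratio_cutoff` parameter are hypothetical, it uses linear fold changes for up-regulated genes only, and it ignores p-values entirely. It does add exactly the extra arbitrary threshold mentioned above.

```python
def classify(fc1, fc2, deg_cutoff=1.5, ratio_cutoff=2.0):
    """Label a gene from its linear fold changes in Treat1 and Treat2.

    A gene is "specific" only if it passes the DEG cutoff in one treatment
    AND its fold change there is at least ratio_cutoff times the other one.
    """
    de1 = fc1 > deg_cutoff
    de2 = fc2 > deg_cutoff
    if not de1 and not de2:
        return "not DE"
    if de1 and fc1 >= ratio_cutoff * fc2:
        return "Treat1-specific"
    if de2 and fc2 >= ratio_cutoff * fc1:
        return "Treat2-specific"
    if de1 and de2:
        return "common"
    # DE in one treatment by the cutoff, but too close to the other to call.
    return "ambiguous"


# The borderline case from the question is no longer called Treat1-specific.
print(classify(1.6, 1.4))  # ambiguous
print(classify(3.2, 1.1))  # Treat1-specific
print(classify(2.0, 2.2))  # common
```

Genes in the "ambiguous" group could then be left out of both the common and the treatment-specific pathway analyses rather than being forced into one of them.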