Hello everyone,
I'm currently analyzing RNA-seq data from two groups of induced cell lines, ETV4 and RUNX1, and I've identified genes that are upregulated or downregulated in both datasets. My objective is to show that this shared gene list is statistically significant, i.e. that the overlap is not merely a product of random chance.
Could anyone advise on the best statistical approach to compare these genes against a background dataset? I'm considering some tests but am uncertain which would be most appropriate for this scenario.
Any guidance or suggestions on selecting the right statistical test would be greatly appreciated. Thank you in advance for your help!
If your approach to getting DEGs is a proper and sound one (e.g. DESeq2 or limma), then I don't see the need for the extra test you describe. Using sound statistics as implemented in the mentioned tools is exactly the test you need to get reliable DEGs (especially in the presence of small sample sizes) rather than just random events.
Thank you, ATpoint, for your reply, but I think I didn't explain myself clearly. Initially, I compared each of my two datasets to its control and identified two distinct sets of differentially expressed genes (DEGs). Then, I found the genes shared between these two DEG sets, both among the upregulated genes and among the downregulated genes. Now, I need a test to check whether these shared genes have a meaningful p-value, i.e. whether the overlap is significant beyond mere chance.
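One common way to frame this kind of question is a hypergeometric test on the overlap (equivalent to a one-sided Fisher's exact test on the 2x2 table). The snippet below is only a minimal sketch in plain Python, not something taken from DESeq2 or limma. It assumes that the background is the set of genes actually tested in both experiments, that upregulated and downregulated lists are tested separately, and that genes can be treated as independent draws, which is a simplification. The function names and the gene identifiers are made up for illustration.

```python
from math import comb


def overlap_pvalue(n_background, n_set_a, n_set_b, n_shared):
    """Upper-tail hypergeometric probability P(overlap >= n_shared).

    Treats set B as a random draw of n_set_b genes from the background,
    of which n_set_a are "marked" as belonging to set A.
    """
    if not (0 <= n_set_a <= n_background and 0 <= n_set_b <= n_background):
        raise ValueError("each gene set must fit inside the background")
    max_overlap = min(n_set_a, n_set_b)
    if not 0 <= n_shared <= max_overlap:
        raise ValueError("n_shared must lie between 0 and the smaller set size")
    total = comb(n_background, n_set_b)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_set_a, k) * comb(n_background - n_set_a, n_set_b - k)
        for k in range(n_shared, max_overlap + 1)
    )
    return tail / total


def shared_deg_test(degs_a, degs_b, background):
    """Return the shared genes and the overlap p-value (hypothetical helper)."""
    background = set(background)
    # Genes outside the background cannot contribute to the test.
    a = set(degs_a) & background
    b = set(degs_b) & background
    shared = a & b
    return shared, overlap_pvalue(len(background), len(a), len(b), len(shared))


# Toy, made-up identifiers purely to show the call; not real results.
background = {f"gene{i}" for i in range(1, 21)}
up_etv4 = {"gene1", "gene2", "gene3", "gene4"}
up_runx1 = {"gene2", "gene3", "gene4", "gene9"}
shared, p = shared_deg_test(up_etv4, up_runx1, background)
print(sorted(shared), p)
```

The choice of background matters a lot here: using all annotated genes instead of the genes that passed expression filtering in both experiments would tend to make the overlap look more significant than it is.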