I have gene expression values for TEST and CONTROL samples from RNA-seq, and I have determined the differentially expressed genes (DEGs). Now I need to build a co-expression network of only the DEGs. I did this by calculating Pearson correlation coefficient (PCC) values with corr.test() in R and then applying a suitable threshold. I need to clarify a few things here. When calculating the PCC, should I use gene expression values from both TEST and CONTROL? Is it logical to build separate networks for TEST and CONTROL?
Yes, it makes sense to construct separate networks for TEST and CONTROL and to then compare these [networks]. In fact, I believe this is preferable to just constructing a network on the entire dataset and then doing post-correlations on modules, as is done in WGCNA.
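To make the "separate networks, then compare" idea concrete, here is a minimal sketch in plain Python (not R, and not anyone's actual pipeline). The gene names, expression values, the 0.9 cutoff, and the helper names `pearson` and `build_network` are all made up for illustration; in R the same logic would sit on top of cor() or corr.test().

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors.

    Returns None when either vector has zero variance (PCC undefined).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def build_network(expr, threshold):
    """Return the set of gene pairs whose |PCC| meets the threshold.

    expr maps gene name -> list of expression values (one per sample).
    """
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if r is not None and abs(r) >= threshold:
            edges.add((g1, g2))
    return edges


# Hypothetical toy data: 3 DEGs, 5 samples per condition.
test_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}
control_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [3.0, 1.0, 4.0, 1.5, 3.5],
    "geneC": [5.2, 4.1, 2.9, 2.0, 1.1],
}

THRESHOLD = 0.9  # arbitrary cutoff chosen for this example only

test_net = build_network(test_expr, THRESHOLD)
control_net = build_network(control_expr, THRESHOLD)

# Compare the two networks edge by edge.
print("TEST only:   ", sorted(test_net - control_net))
print("CONTROL only:", sorted(control_net - test_net))
print("Shared:      ", sorted(test_net & control_net))
```

With this toy data, the geneA-geneB edge appears only in the TEST network and the geneA-geneC edge only in the CONTROL network. Whether such a difference means anything biologically is a separate question.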
But constructing a network from the expression values of a single condition only shows how the genes are correlated across the samples of that condition. It won't tell us much about whether any two genes are correlated across conditions.
Also, for any given condition, the gene expression values should be the same across samples/replicates, at least theoretically. In that case, the PCC would be ~0 if we calculate the gene-gene PCC using expression data from one condition alone.
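A small numerical sketch may help separate the two situations described here. It is plain Python with invented numbers, and `pearson` is a hand-written helper, not a library call. Two hypothetical genes both shift between conditions but vary independently among replicates. The within-condition PCC then comes out near 0, while the pooled PCC is high simply because of the condition shift. Note also that if replicates were truly identical, the PCC would be undefined (zero variance) rather than 0.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None if either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Hypothetical expression values: both genes are higher in TEST,
# but their replicate-to-replicate noise is unrelated.
gene_x_control = [10.1, 9.9, 10.0, 10.2, 9.8]
gene_y_control = [5.1, 5.1, 4.8, 5.0, 5.0]
gene_x_test = [20.0, 20.2, 19.8, 19.9, 20.1]
gene_y_test = [14.8, 15.1, 15.1, 15.0, 15.0]

# Within-condition correlation: driven only by replicate noise.
r_control = pearson(gene_x_control, gene_y_control)
r_test = pearson(gene_x_test, gene_y_test)

# Pooled correlation: dominated by the TEST vs CONTROL shift.
r_pooled = pearson(gene_x_control + gene_x_test,
                   gene_y_control + gene_y_test)

# Perfectly identical replicates: variance is zero, PCC is undefined.
r_constant = pearson([5.0] * 5, gene_y_control)

print("within CONTROL:", r_control)
print("within TEST:   ", r_test)
print("pooled:        ", r_pooled)
print("constant gene: ", r_constant)
```

For DEGs in particular, a pooled correlation will tend to be high for almost any pair, since every gene shifts between conditions by definition. Real replicates are never identical either, so within-condition PCC values reflect whatever biological and technical variation the replicates carry.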
...and what does it actually mean, biologically, if 2 genes are correlated "across conditions"? There are different ways to interpret this definition. On the other hand, what does it mean if geneX is correlated with geneY in TEST but not in CONTROL? In both cases, it may mean absolutely nothing because it's just a correlation.
Look, don't think too much about network analysis - it will never help you to fully understand a disease mechanism. If you have been asked to do it by your PI, you can do it and tick it off the list [of things to do]. In network analysis, people always highlight very isolated examples as success stories, much like DTC (direct to consumer) genetic testing companies; however, most people with whom I have worked have become so worried about these analyses that they decided to not include them in results.