**Hi, I am trying to run a WGCNA analysis on my bulk RNA-seq data.
I have 3 cell lines, each with three conditions: control, ExposureLow, and ExposureHigh. I am also running a surrogate variable analysis and treating the surrogate variables (SVs) as covariates of no interest.
I am interested in the Exposure effect.
A1) Regarding normalization: I am using calcNormFactors and voom to normalize and prepare the data for WGCNA. I am also regressing out the effect of Line and the SVs by estimating their beta coefficients and subtracting the fitted effects from the normalized counts. I get a very nice PCA clustering by Exposure condition, but I wonder whether this way of regressing out is correct, and whether I need any other transformation before WGCNA.
A2) Regarding WGCNA: the scale-free topology fit suggests a power of 28 to get a fit index > 0.8, and with blockwiseModules() the tree is awful, very noisy (see image). I can still play with the parameters and get some modules, some even significant for my conditions, but is this OK, or does it just mean my normalization was not correct?
B1) I also tried normalization with DESeq2, but in the PCA the samples cluster by line and not by exposure. I wonder whether I can correct for Line and the SVs as I did with voom, or whether I am missing some other transformation.
B2) Here the scale-free topology fit is much better and I could pick a power of 7, but the blockwiseModules() results are very different and the modules are no longer significant for my conditions.
C) I also already have TPM values from RSEM. If I start from read counts, should I still do A or B, or just log-transform?
Are both approaches done correctly?
Why are genes with low counts removed in approach B but not in A? Should I not be using the same criteria?
Thank you in advance!**
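
To make the regression step in A1 concrete, here is a minimal sketch of the idea in plain Python (not the R pipeline itself). It assumes that the nuisance terms (Line, and in practice the SVs as extra numeric columns) are fitted in the same per-gene model as Exposure, and that only the nuisance part of the fit is subtracted. The function names `solve`, `ols` and `remove_nuisance` and the toy expression values are hypothetical and made up for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def remove_nuisance(y, keep, nuisance):
    """Fit keep + nuisance columns jointly, then subtract only the nuisance fit."""
    X = [k + n for k, n in zip(keep, nuisance)]
    beta = ols(X, y)
    beta_nuisance = beta[len(keep[0]):]
    return [yi - sum(b * v for b, v in zip(beta_nuisance, n))
            for yi, n in zip(y, nuisance)]


# Toy design: 3 lines (A, B, C) x 3 conditions (control, low, high), one gene.
# keep columns: intercept, ExposureLow, ExposureHigh (the effect of interest).
keep = [[1, 0, 0], [1, 1, 0], [1, 0, 1]] * 3
# nuisance columns: dummy variables for line B and line C (SVs would be added here).
nuisance = [[0, 0]] * 3 + [[1, 0]] * 3 + [[0, 1]] * 3
# Made-up log-expression: 5 + 1*low + 2*high + 3*lineB - 4*lineC.
y = [5, 6, 7, 8, 9, 10, 1, 2, 3]

corrected = remove_nuisance(y, keep, nuisance)
print([round(v, 6) for v in corrected])
```

In this toy case the Line offsets are removed while the Exposure differences stay in the corrected values, which is the behaviour I am hoping for. Whether this matches what I did with voom depends on whether Exposure was in the model when the Line and SV coefficients were estimated.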