Hi Biostars,
I am using WGCNA for co-expression analysis. I used the automatic network construction and module detection method with a signed network, and "bicor" for the correlation. Everything went well, but the number of genes significantly correlated with the trait differs from the number of genes in the modules whose eigengene is significantly correlated with the trait!
In the picture, you can see that the salmon and blue modules are significantly correlated with GIR. Together, these two modules contain 2000 genes.
However, when I checked the GSPvalue object, the number of genes significantly correlated with GIR was 987.
Gene.signif_GIR <- GSPvalue %>% filter(p.GS.IS < 0.05)
I don't know why that happened. I thought it should be the same.
I would appreciate your answers!
Thank you!
Hi eggerj, Thank you so much for your quick reply. I have posted the picture again for your understanding! https://imgbb.com/tLgNd87
To rephrase my question: I thought the genes in the significant modules (the modules correlated with the trait) would be identical to the list of significant genes you get from GSPvalue. Thank you!
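Right, that most likely won't be the case. The module-trait correlation is computed on the module eigengene, which summarizes the expression of the whole module, so it doesn't guarantee that each member gene is correlated with the trait on its own. Not all of the genes in those modules will be correlated with your trait, but a reasonable number should be. It will vary from case to case. Maybe 5%, maybe 20%, maybe 30% of the genes in a significant module will individually be significantly correlated with the trait.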
It's also reasonable that other genes in your network not found within those modules will have high correlations with your trait as well. Looking at your figure, I bet there are a handful of genes not in the blue or salmon module that have reasonable gene significance with GIR.
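To illustrate the point, here is a small base-R simulation (not your data, and not WGCNA itself). It builds a fake module whose genes share a signal that is partly driven by a stand-in trait, then compares the eigengene-level p-value with the per-gene p-values. The sample size, noise levels, and the use of Pearson correlation instead of bicor are all assumptions made just for this sketch, and the eigengene is approximated as the first singular vector of the scaled expression matrix.

```r
# Simulated data only: none of these numbers come from the original analysis.
set.seed(1)
n_samples <- 30
n_genes <- 200

trait <- rnorm(n_samples)                  # stand-in for GIR
latent <- 0.6 * trait + rnorm(n_samples)   # shared module signal, partly trait-driven

# Each gene = shared signal + a lot of gene-specific noise (samples x genes matrix)
expr <- sapply(seq_len(n_genes), function(i) latent + rnorm(n_samples, sd = 2))

# Approximate module eigengene: first left singular vector of the scaled matrix
eigengene <- svd(scale(expr))$u[, 1]
eigengene_p <- cor.test(eigengene, trait)$p.value

# Gene significance: per-gene correlation p-value with the trait
gene_p <- apply(expr, 2, function(g) cor.test(g, trait)$p.value)

cat("Eigengene p-value:", signif(eigengene_p, 3), "\n")
cat("Genes with p < 0.05:", sum(gene_p < 0.05), "of", n_genes, "\n")
```

Changing the `sd = 2` noise level should move the fraction of individually significant genes up or down, which is why that fraction varies so much between datasets.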
Hi eggerj,
Thank you so much for your explanations! If I understand you correctly, you are saying that I may still find genes significantly correlated with GIR in the non-significant modules. If that is the case, could you please give me some directions on how to find them?
Thank you!
There are multiple ways to do this. If you're using WGCNA you should have a vector of module assignments (created from labels2colors). You could create a table or dataframe by merging them with your gene significance results (GSPvalue?) using cbind. Just make sure the genes are ordered the same. You could then subset the dataframe by significance and see the remaining module assignments.
I don't have all of your code available to see exactly what you're working with, but it should be pretty straightforward.
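Something along these lines might work as a starting point. Since I can't see your objects, the sketch below builds small made-up stand-ins: `moduleColors` plays the role of your `labels2colors` output, and `GSPvalue` is assumed to be a data frame with genes as row names and a `p.GS.IS` column (taken from your filter line). The gene names and p-values are invented for illustration.

```r
# Hypothetical stand-ins for objects from a WGCNA run; replace with your own.
set.seed(42)
genes <- paste0("gene", 1:12)
moduleColors <- c(rep("blue", 4), rep("salmon", 4), rep("grey", 4))
names(moduleColors) <- genes
GSPvalue <- data.frame(p.GS.IS = runif(12, 0, 0.2), row.names = genes)

# Make sure both objects list the genes in the same order before combining
stopifnot(identical(rownames(GSPvalue), names(moduleColors)))

# One row per gene: its p-value plus its module assignment
gene_table <- cbind(GSPvalue, module = moduleColors)

# Keep only the individually significant genes
signif_genes <- gene_table[gene_table$p.GS.IS < 0.05, , drop = FALSE]

# How many significant genes fall in each module
print(table(signif_genes$module))

# Significant genes outside the trait-associated modules
outside <- signif_genes[!signif_genes$module %in% c("blue", "salmon"), , drop = FALSE]
print(rownames(outside))
```

In a real run the module vector usually has no names, so you would check the row names of `GSPvalue` against the gene (column) names of your expression matrix instead.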