Dear all,
I have about 6400 genes and 35 samples. Using WGCNA, I would like to find modules that are correlated with some traits, which are a mix of quantitative and categorical traits. The categorical traits are coded as 1, 2, 3, 4. I have played around with settings such as mergeCutHeight and the type of correlation, but I am not sure about them.
Here is the cluster dendrogram with the corresponding modules (19 modules), obtained with this code:
net = blockwiseModules(variable, power = 16, networkType = "signed", corType = "pearson", maxBlockSize = 8000, TOMType = "signed", minModuleSize = 40, reassignThreshold = 0, mergeCutHeight = 0.35, numericLabels = TRUE, pamRespectsDendro = FALSE, saveTOMs = TRUE, nThreads = 3, verbose = 3)
My questions are:
1. In my view, the left portion of the cluster dendrogram seems a bit strange: it contains some small modules (colors). What do you think? Do I need to increase the minimum module size or mergeCutHeight?
2. In general, is mergeCutHeight = 0.35 appropriate, considering that the height of the dendrogram starts from 0.7?
3. As far as I can tell, both Pearson and bicor can be used with quantitative and categorical traits, although bicor is said to be more powerful than Pearson. In my work, I found that more modules were significantly correlated with the traits of interest when using Pearson. So, can I continue with Pearson?
4. Overall, in a WGCNA analysis, should we tune the parameters to obtain fewer modules (rather than many modules) that are significantly correlated with the traits of interest?
Thanks
Tagging andres.firrincieli and Kevin Blighe. I hope to get your helpful responses and feedback!
For points 1 and 2, see Peter Langfelder's answer: https://support.bioconductor.org/p/104602/
Regarding point 3, Pearson correlation is more affected by outliers. Therefore, a single outlier sample can generate a significant correlation with traits of interest. If you have a single sample where genes are highly underexpressed or overexpressed compared to all the other samples, this outlier can create a module that significantly correlates with one or more traits of interest. The moral of the story is to always check the expression profile of the module of interest before drawing conclusions about the module-traits relationship analysis.
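As a rough illustration of that check, here is a minimal sketch on simulated data (base R only; the variable names and the outlier are made up for the example, not taken from your dataset). It recomputes the Pearson correlation with each sample left out in turn, so a correlation that depends on one sample stands out:

```r
# Simulated example: 35 samples, a module eigengene that is unrelated
# to the trait, except for one sample that is extreme in both.
set.seed(1)
n <- 35
trait <- rnorm(n)
eigengene <- rnorm(n)
trait[1] <- 8        # hypothetical outlier sample
eigengene[1] <- 8

# Pearson correlation using all samples
full <- cor(eigengene, trait, method = "pearson")

# Leave-one-out: recompute the correlation with each sample removed in turn
loo <- sapply(seq_len(n), function(i) cor(eigengene[-i], trait[-i]))

# The sample whose removal changes the correlation the most
influential <- which.max(abs(loo - full))

print(full)
print(loo[influential])
print(influential)
```

If the correlation collapses once a single sample is removed, the module-trait relationship is probably driven by that sample. With WGCNA loaded, comparing against bicor(eigengene, trait) is another way to see the same effect.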
Thank you for your reply. I had already read that post from Peter Langfelder, but I am still not sure about the appropriate value for merging.
Considering this dendrogram, where mergeCutHeight is 0.25 and the minimum module size is 30, I am thinking of increasing the cut height to 0.4. Could you please share your thoughts?
Why don't you look at the module-trait relationship heatmap and then decide cut height parameter?
Thanks for the feedback. Here is the module-trait relationship heatmap with mergeCutHeight = 0.25, minModuleSize = 30, and corType = "bicor".
Would you please tell me how this heatmap can guide us in choosing the appropriate parameters?
Hi seta,
Have a look at the heat map and tell me which modules show similar correlation patterns against the traits of interest.
For me, they are greenyellow and salmon... maybe blue and purple (?).
So, increasing the cut height further to 0.4 could make things worse by merging modules whose genes behave quite differently with respect to your traits.
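To make "similar correlation patterns" a bit more concrete, here is a minimal sketch on simulated data. It uses base R only; the trait names are hypothetical, and the module names are just labels borrowed from the heatmap. It builds a module-trait correlation table with Student p-values and then compares the rows of that table between modules:

```r
# Simulated data: 35 samples, 3 hypothetical traits, 3 module eigengenes.
set.seed(7)
n <- 35
traits <- data.frame(weight = rnorm(n), glucose = rnorm(n), age = rnorm(n))
MEs <- data.frame(
  MEgreenyellow = traits$weight - traits$glucose + rnorm(n, sd = 0.5),
  MEsalmon      = traits$weight - traits$glucose + rnorm(n, sd = 0.5),
  MEbrown       = traits$age + rnorm(n, sd = 0.5)
)

# Module-trait correlations (modules in rows, traits in columns)
moduleTraitCor <- cor(MEs, traits, use = "pairwise.complete.obs")

# Student asymptotic p-values for those correlations
tstat <- moduleTraitCor * sqrt((n - 2) / (1 - moduleTraitCor^2))
moduleTraitP <- 2 * pt(-abs(tstat), df = n - 2)

# Similarity of correlation patterns: correlate the rows of the table.
# Values near 1 mean two modules relate to the traits in the same way
# (both direction and relative magnitude).
patternSim <- cor(t(moduleTraitCor))

print(round(moduleTraitCor, 2))
print(round(patternSim, 2))
```

Modules whose rows look alike here are the candidates for merging; modules with different rows are the ones a high cut height risks lumping together.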
Thanks for following the issue! Sorry, did you mean similar or different correlation patterns? Would it not be better to merge modules with similar correlation patterns? And by correlation pattern, do you mean both direction and magnitude?
Here is the module-trait relationship heatmap with mergeCutHeight = 0.4, but with the correlation type set to "pearson".
Similar
Both direction and magnitude
Just to be clear, the module dendrogram (MEtree) is calculated as follows:
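A minimal sketch of that calculation, assuming the usual WGCNA tutorial approach (1 minus the correlation of the module eigengenes, then average-linkage clustering). The eigengenes below are simulated and the module names are only labels; base R's cor() stands in for the WGCNA version:

```r
# Simulated module eigengenes for 35 samples: two modules that track each
# other closely and one unrelated module.
set.seed(42)
n_samples <- 35
base <- rnorm(n_samples)
MEs <- data.frame(
  MEblue   = base + rnorm(n_samples, sd = 0.3),
  MEpurple = base + rnorm(n_samples, sd = 0.3),
  MEgreen  = rnorm(n_samples)
)

# Dissimilarity between module eigengenes
MEDiss <- 1 - cor(MEs)

# Average-linkage hierarchical clustering of the eigengenes
METree <- hclust(as.dist(MEDiss), method = "average")

# Cutting the tree at mergeCutHeight groups modules whose eigengene
# dissimilarity is below that height
mergeCutHeight <- 0.4
merged_groups <- cutree(METree, h = mergeCutHeight)

# Eigengene correlation that corresponds to the cut height
min_cor <- 1 - mergeCutHeight

print(round(MEDiss, 2))
print(merged_groups)
print(min_cor)
```

For two single modules the correspondence is exact: a height of 0.4 means a correlation of 0.6. Once groups of modules are involved, average linkage compares the mean dissimilarity between groups, so not every pair inside a merged group has to reach that correlation.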
So the distances in MEtree are a measure of dissimilarity (1 - cor). If you select mergeCutHeight = 0.4, you are basically merging modules whose eigengenes share a Pearson correlation coefficient of 0.6 or above.
Thank you for the reply. Right, you made a good point about the MEtree. In this paper, the authors used mergeCutHeight = 0.7, so they merged modules with a correlation coefficient of at least 0.3, which sounds very small! Could you please share your thoughts?
Keep in mind that in blockwiseModules() the default value is mergeCutHeight = 0.15. So, 0.7 is too high!
Right, 0.7 seems too high to me. I would like to know the reason behind the selection of this value.
The always-helpful Kevin is one of the authors of the paper mentioned above, so he can help explain the reason behind choosing a mergeCutHeight of 0.7.
#kevin