Question

mergeCutHeight in WGCNA and correlation type for correct module identification

9 months ago

seta ★ 1.9k

Dear all,

I have about 6400 genes and 35 samples. I would like to run WGCNA to find modules that are correlated with some traits, which are a mix of quantitative and categorical traits. Categorical traits are coded as 1, 2, 3, 4. I played around with parameters such as mergeCutHeight and the correlation type. However, I am not sure about them.

Here is a sample of the cluster dendrogram, with the corresponding modules (19 modules) obtained via this code:

net = blockwiseModules(variable, power = 16, networkType = "signed", corType = "pearson",
                       maxBlockSize = 8000, TOMType = "signed", minModuleSize = 40,
                       reassignThreshold = 0, mergeCutHeight = 0.35, numericLabels = TRUE,
                       pamRespectsDendro = FALSE, saveTOMs = TRUE, nThreads = 3, verbose = 3)

My questions are:

1. In my view, the left portion of the cluster dendrogram seems a bit strange: there are some small modules (colors). What do you think? Do I need to increase the minimum module size or increase mergeCutHeight?
2. In general, is mergeCutHeight = 0.35 appropriate, considering that the height of the dendrogram starts from 0.7?
3. As far as I can tell, both Pearson and bicor can be used with both quantitative and categorical traits; however, bicor is more powerful than Pearson. In my work, I found that with Pearson, more modules were significantly correlated with the traits of interest. So, can I continue with Pearson?

Overall, in a WGCNA analysis, should we try to optimize the parameters to obtain fewer modules (rather than many modules) that are significantly correlated with the traits of interest?

Thanks

mergeCutHeight correlation WGCNA categorigal • 2.3k views

ADD COMMENT • link 9 months ago by seta ★ 1.9k

Tagging andres.firrincieli and Kevin Blighe. I hope to get your helpful responses and feedback!

ADD REPLY • link 9 months ago by seta ★ 1.9k

For points 1 and 2, see Peter Langfelder's answer: https://support.bioconductor.org/p/104602/

Regarding point 3, Pearson correlation is more affected by outliers, so a single outlier sample can generate a significant correlation with the traits of interest. If you have a single sample where genes are strongly underexpressed or overexpressed compared to all the other samples, this outlier can create a module that significantly correlates with one or more traits of interest. The moral of the story is to always check the expression profile of the module of interest before drawing conclusions from the module-trait relationship analysis.
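To make the outlier point concrete, here is a minimal sketch in base R. The eigengene and trait values are made up for illustration (they are not from the data in this thread), and the leave-one-out check is just one simple way to spot a sample that drives a correlation; it is not a WGCNA function.

```r
# Hypothetical module eigengene and trait for 10 samples (illustrative values only).
# Samples 1-9 show essentially no relationship; sample 10 is extreme in both.
me    <- c(0.10, -0.20, 0.15, -0.10, 0.05, -0.15, 0.20, -0.05, 0.00, 3.00)
trait <- c(2, 1, 3, 2, 1, 3, 1, 2, 3, 10)

# Pearson correlation using all samples
r_all <- cor(me, trait, method = "pearson")

# Leave-one-out: recompute the correlation with each sample removed in turn
r_loo <- sapply(seq_along(me), function(i) cor(me[-i], trait[-i]))

# Index of the sample whose removal changes the correlation the most
driver <- which.max(abs(r_all - r_loo))

print(round(r_all, 2))
print(round(r_loo, 2))
print(driver)
```

If a single sample dominates like this, plotting the eigengene across samples (for example with barplot(me)) usually makes it obvious before any module-trait conclusion is drawn.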

ADD REPLY • link 9 months ago by andres.firrincieli 3.9k

Thank you for your reply. I've already read that post from Peter Langfelder; however, I am still not sure about the appropriate value for merging.

[Image: cluster dendrogram]

Considering this dendrogram, where the cut height is at the default value of 0.25 and the minimum module size is 30, I am thinking of increasing the cut height to 0.4. Could you please share your thoughts?

ADD REPLY • link 9 months ago by seta ★ 1.9k

Why don't you look at the module-trait relationship heatmap and then decide on the cut height parameter?

ADD REPLY • link 9 months ago by andres.firrincieli 3.9k

Thanks for the feedback. Here is the module-trait relationship heatmap with mergeCutHeight = 0.25, minModuleSize = 30, and corType = "bicor".

[Image: module-trait relationship heatmap]

Would you please tell me how this heatmap can guide the choice of appropriate parameters?

ADD REPLY • link 9 months ago by seta ★ 1.9k

Hi seta,

Have a look at the heat map and tell me which modules show similar correlation patterns against the traits of interest.

For me, they are greenyellow and salmon... maybe blue and purple (?).

So, further increasing the cut height to 0.4 could make things worse by merging modules whose genes behave quite differently with respect to your traits.

ADD REPLY • link 9 months ago by andres.firrincieli 3.9k

Thanks for following up on the issue! Sorry, do you mean similar or different correlation patterns? Wouldn't it be better to merge modules with similar correlation patterns? By correlation pattern, do you mean both direction and magnitude?

Here is the module-trait relationship heatmap with mergeCutHeight = 0.4, but with the correlation type set to "pearson".

[Image: module-trait relationship heatmap]

ADD REPLY • link 9 months ago by seta ★ 1.9k

sorry, similar or different correlation pattern

Similar

from the correlation pattern, you mean both direction and magnitude?

Both direction and magnitude

Just to be clear, the module dendrogram (MEtree) is calculated as follows:

MEDiss = 1-cor(MEs) # MEs are the module eigengenes
MEtree = hclust(as.dist(MEDiss), method = "average")

plot(MEtree, main = "Clustering of module eigengenes", xlab = "", sub = "")

So the distances in MEtree are a measure of dissimilarity (1 - cor). If you select mergeCutHeight = 0.4, you are basically merging modules whose eigengenes share a Pearson correlation coefficient equal to or above 0.6.
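As a rough illustration of that relationship, the sketch below builds four made-up eigengenes in base R, clusters them on 1 - cor, and cuts the tree at 0.4 to see which would be grouped together. The values and names (MEa to MEd, cutHeight) are hypothetical. This only approximates what happens inside WGCNA: as far as I understand, the package recalculates eigengenes after merging, and with average linkage the cut applies to the average dissimilarity between clusters rather than to every pair.

```r
# Hypothetical module eigengenes for 8 samples (illustrative values, not real data)
trend  <- 1:8
flip   <- rep(c(1, -1), times = 4)
wobble <- rep(c(0.3, 0.3, -0.3, -0.3), times = 2)

MEs <- data.frame(
  MEa = trend,               # steadily increasing profile
  MEb = trend + 0.5 * flip,  # close to MEa (correlation about 0.98)
  MEc = flip,                # alternating profile
  MEd = flip + wobble        # close to MEc (correlation about 0.96)
)

# Same construction as the MEtree code above
MEDiss <- 1 - cor(MEs)
MEtree <- hclust(as.dist(MEDiss), method = "average")

# Eigengenes joined below this height would be candidates for merging
cutHeight <- 0.4
groups <- cutree(MEtree, h = cutHeight)

print(round(cor(MEs), 2))
print(groups)
```

Here MEa and MEb fall into one group and MEc and MEd into another, while the two groups stay separate because their eigengenes are weakly or negatively correlated.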

ADD REPLY • link 9 months ago by andres.firrincieli 3.9k

Thank you for the reply. Right, you made a good point about the MEtree. In this paper, the authors used mergeCutHeight = 0.7, so they merged modules with a correlation coefficient of at least 0.3 (30%), which sounds very low! Could you please share your thoughts?

ADD REPLY • link 9 months ago by seta ★ 1.9k

Keep in mind that in blockwiseModules() the default value is mergeCutHeight = 0.15. So, 0.7 is too high!
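For reference, the cut heights mentioned in this thread can be put side by side with the eigengene correlation they imply, assuming the dissimilarity is 1 - cor as described earlier. This is just the arithmetic in base R; the column names are made up for the example.

```r
# Cut heights discussed in this thread
cutHeights <- c(0.15, 0.25, 0.35, 0.40, 0.70)

# Under dissimilarity = 1 - cor, a cut height h corresponds to cor >= 1 - h
thresholds <- data.frame(
  mergeCutHeight  = cutHeights,
  minEigengeneCor = 1 - cutHeights
)

print(thresholds)
```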

ADD REPLY • link 9 months ago by andres.firrincieli 3.9k

Right, 0.7 seems too high to me. I would like to know the reason behind the selection of this value.

ADD REPLY • link 9 months ago by seta ★ 1.9k

The always-helpful Kevin is one of the authors of the mentioned paper, so he can help explain the reason behind the selection of a mergeCutHeight of 0.7.

#kevin

ADD REPLY • link 9 months ago by seta ★ 1.9k