WGCNA: how to avoid huge modules
22 months ago
cao510927 ▴ 40

I am running WGCNA on DNA methylation EPIC array data. For co-methylation module detection, I selected the 400,000 CpG probes with the highest variance across 71 samples. However, I always get one very large module, which makes the cluster dendrogram look odd. I don't know what went wrong in my analysis, and I'm new to this kind of analysis. Any suggestions would be appreciated.

[Image: cluster dendrogram]

Code:

bwnet = blockwiseModules(datMethy, corType = "pearson", maxBlockSize = 12000, deepSplit = 2,
                         networkType = "signed", power = 12, minModuleSize = 30,
                         reassignThreshold = 0, mergeCutHeight = 0.25, numericLabels = TRUE,
                         saveTOMs = TRUE, pamRespectsDendro = FALSE, saveTOMFileBase = "methyTOM", verbose = 3)

What does a basic PCA of these probes look like (for both the 71 samples and the 400k probes)? I have seen this numerous times. Sometimes it can be explained by strong and widespread differential methylation between conditions (e.g., treatment with a DNMT inhibitor, or a batch effect), and sometimes it's inexplicable.

In the former case (or in any case where PCA reveals groups), you can either (1) compute TOMs within each group and build a consensus, or (2) compute networks within groups and compare them. Option (1) is for when the grouping is coincidental or not informative for your questions; option (2) is for when the sample grouping is of biological interest.

In the "inexplicable" cases, I have had success "converting" the above kind of dendrogram via so-called "robust" WGCNA: draw 10-20 bootstraps of your 71 samples (generating 10-20 TOMs), then call the consensusTOM function to merge the results.
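The bootstrap-and-merge idea can be sketched in base R. This is only an illustration under stated assumptions, not the WGCNA implementation: the data are simulated, the signed adjacency ((1 + cor) / 2)^power stands in for the TOM, and the element-wise median is an assumed consensus rule. In a real analysis each bootstrap would yield a TOM (for example via WGCNA's TOMsimilarityFromExpr), and the merge would be done with consensusTOM as described above; check the package documentation for the exact arguments.

```r
set.seed(42)
n_samples <- 71   # matches the thread; the data below are simulated
n_probes  <- 50   # tiny toy size, not the real 400k probes
n_boot    <- 10
power     <- 12

# Simulated stand-in for datMethy: samples in rows, probes in columns.
datMethy <- matrix(rnorm(n_samples * n_probes), nrow = n_samples)

# Signed adjacency, ((1 + cor) / 2)^power, for one bootstrap resample.
boot_adjacency <- function(dat, power) {
  idx <- sample(nrow(dat), replace = TRUE)   # resample samples with replacement
  ((1 + cor(dat[idx, , drop = FALSE])) / 2)^power
}

adj_list <- replicate(n_boot, boot_adjacency(datMethy, power), simplify = FALSE)

# Element-wise median across bootstraps as a simple consensus (assumption).
adj_array <- simplify2array(adj_list)        # n_probes x n_probes x n_boot
consensus <- apply(adj_array, c(1, 2), median)

# Cluster probes on the consensus dissimilarity.
tree <- hclust(as.dist(1 - consensus), method = "average")
```

The median is less sensitive than the mean to a single bootstrap dominated by a few outlying samples, which is the point of the "robust" variant.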


[Image: PCA plot of the samples]

Hi LChart, thank you very much for your helpful advice, and I'm sorry for the late reply. The PCA plot does not show any obvious grouping of the samples from the two conditions. However, the top 2 components together explain only 25.9% of the variance (5.44% + 20.46%). In that case, should I follow your second suggestion (robust WGCNA)? Thanks,


Can you show the other side of the singular value decomposition (i.e., the gene loadings)?


[Two images]


The loadings will be a (num_probe, num_components) matrix. You can obtain them either by running svd on the scaled methylation data and taking the left singular vectors, or by multiplying the scaled methylation data by your (num_sample, num_components) matrix of right singular vectors. The second route gives the left singular vectors scaled by their singular values.
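Both routes can be checked on a toy matrix in base R. This is a minimal sketch with simulated data; it assumes probes are in rows and samples in columns, and that "scaled" means each probe is centered and scaled across samples. If your matrix is laid out like the datMethy passed to blockwiseModules (samples in rows), transpose it first.

```r
# Toy stand-in for the methylation data: probes in rows, samples in columns.
set.seed(1)
num_probe  <- 200
num_sample <- 10
k          <- 3    # number of components to keep

meth <- matrix(rnorm(num_probe * num_sample), nrow = num_probe)  # simulated values

# Center and scale each probe (row); scale() works on columns, hence the transposes.
X <- t(scale(t(meth)))

s <- svd(X)
U <- s$u[, 1:k]    # left singular vectors:  (num_probe,  k)
V <- s$v[, 1:k]    # right singular vectors: (num_sample, k)
d <- s$d[1:k]      # singular values

# Route 1: loadings taken directly as the left singular vectors.
loadings_left <- U

# Route 2: project the data onto the right singular vectors.
loadings_proj <- X %*% V   # equals U with each column multiplied by its singular value

dim(loadings_left)
```

The two results differ only by the per-component scale factor d, so a plot of either one shows the same structure among probes.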

ADD REPLY
22 months ago
cao510927 ▴ 40

Hi LChart,

I'm sorry, I made a stupid mistake here. I ran the 400k probes block-wise, so it is impossible to create a single gene dendrogram that combines information from multiple blocks. I just realized that the plot I showed here is for block 1 only. But thanks anyway for your advice and help. Have a good day!
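For anyone hitting the same issue, a block-wise result can be inspected one block at a time. The sketch below is hedged: bwnet_mock is a hypothetical stand-in carrying only the colors and blockGenes fields that blockwiseModules returns, and both helper functions are illustrative names. The plotting helper follows the pattern from the WGCNA tutorials and is defined but not called, since it needs the WGCNA package and a real result object.

```r
# Mock of the relevant fields of a blockwiseModules() result (hypothetical values).
bwnet_mock <- list(
  colors     = c(1, 1, 2, 0, 1, 2, 2, 1, 0, 1),
  blockGenes = list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10))
)

# Module sizes within each block; with numeric labels, 0 means "unassigned".
module_sizes_by_block <- function(net) {
  lapply(net$blockGenes, function(idx) table(net$colors[idx]))
}

sizes <- module_sizes_by_block(bwnet_mock)

# One dendrogram per block. Not called here: requires WGCNA and a real result
# that includes the dendrograms field.
plot_each_block <- function(net) {
  colors <- WGCNA::labels2colors(net$colors)
  for (b in seq_along(net$dendrograms)) {
    WGCNA::plotDendroAndColors(net$dendrograms[[b]], colors[net$blockGenes[[b]]],
                               "Module colors", dendroLabels = FALSE,
                               main = paste("Block", b))
  }
}
```

Comparing the per-block size tables shows whether a "huge module" is a feature of every block or only of the one that happened to be plotted.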
