Question

How can I increase the sensitivity of WGCNA modules?

7.5 years ago

genomeandahalf ▴ 40

I'm running a WGCNA analysis on ~50,000 transcripts with the blockwiseModules command:

modules = blockwiseModules(wgcna_data, maxBlockSize = 10000, checkMissingData = TRUE, minModuleSize = 20, deepSplit = 4, mergeCutHeight = 0.25, power = power, networkType = 'signed', replaceMissingAdjacencies = FALSE)

I end up with ~250 modules, some of which contain thousands of genes. However, I'd like to get more specific modules, i.e., to break these large modules into whatever submodules exist within them in the clustering tree, so that I can explore their expression substructure. I tried increasing deepSplit to 4 and mergeCutHeight to 0.25, but these parameters did not substantially increase the number of modules. Is there some way to tell the clustering algorithm to be more stringent about module inclusion? Alternatively, would it be possible to apply a different clustering algorithm to the dendrograms that allows more stringent module cutoffs?

WGCNA RNA co-expression RNA-Seq • 6.8k views

updated 7.3 years ago by Biostar 20 • written 7.5 years ago by genomeandahalf ▴ 40

You can specify a max number of genes per module to avoid big modules. You could also apply the same mergeCutHeight algorithm to a given module.

7.5 years ago by Lluís R. ★ 1.2k
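A rough sketch of the second suggestion above: re-running module detection on the transcripts of a single large module. The helper names largest_module and recluster_module are made up for illustration. It assumes modules$colors holds colour labels with "grey" for unassigned transcripts, which is what blockwiseModules returns when numericLabels = FALSE (the default). The network is rebuilt from the subset alone, so the topological overlap will differ from the full-data run, and the resulting submodules are best treated as exploratory.

```r
# Name of the largest module, ignoring the "grey" (unassigned) label.
largest_module <- function(colors, unassigned = "grey") {
  sizes <- table(colors[colors != unassigned])
  names(sizes)[which.max(sizes)]
}

# Re-run module detection on the transcripts of one module only.
# `expr` is the samples x transcripts matrix given to the original call;
# extra arguments in `...` are passed on to WGCNA::blockwiseModules().
recluster_module <- function(expr, colors, module, power, ...) {
  sub_expr <- expr[, colors == module, drop = FALSE]
  WGCNA::blockwiseModules(
    sub_expr,
    maxBlockSize = ncol(sub_expr),  # keep the whole module in one block
    power = power,
    networkType = "signed",
    ...
  )
}

# Usage with the objects from the question (not run here):
# big <- largest_module(modules$colors)
# sub <- recluster_module(wgcna_data, modules$colors, big, power = power,
#                         minModuleSize = 20, deepSplit = 4,
#                         mergeCutHeight = 0.10)
# table(sub$colors)
```

The mergeCutHeight = 0.10 in the usage comment is an arbitrary example value. Lowering the merge height merges fewer modules and so tends to leave more, smaller ones; raising it (for example from the 0.15 default to 0.25) merges more.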

What's the parameter for max genes per module? I don't see it in the documentation.

7.5 years ago by genomeandahalf ▴ 40

I could only find the minimum, minModuleSize; I might have confused it with another option. Sorry.

7.5 years ago by Lluís R. ★ 1.2k

Late to the party, unfortunately. I think you should first try to reduce the number of genes/transcripts given as input, removing low expressors and transcripts that do not change across the conditions. It is unlikely that 50k transcripts are acting in concert under any single biological condition; by removing the unaffected genes, you can probably get rid of a lot of random correlations.

7.3 years ago by Michael 55k

The authors do recommend removing the "noisier" genes, either by mean expression or by variance, but they also recommend not filtering by differential expression; see FAQ question 2.

7.3 years ago by h.mon 35k
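To illustrate filtering on mean expression and variance rather than on differential expression, here is a minimal base-R sketch. The function name filter_transcripts and both thresholds are hypothetical placeholders, not values recommended by the WGCNA authors. It assumes expr is a samples x transcripts matrix of normalized, log-scale expression, like the wgcna_data in the question.

```r
# Keep the `top_fraction` most variable transcripts among those whose mean
# expression is at least `min_mean`. Both defaults are placeholders.
filter_transcripts <- function(expr, min_mean = 1, top_fraction = 0.5) {
  means <- colMeans(expr, na.rm = TRUE)
  vars <- apply(expr, 2, var, na.rm = TRUE)

  # Step 1: drop low expressors.
  expressed <- means >= min_mean

  # Step 2: among the remaining transcripts, keep those at or above the
  # variance quantile that corresponds to the requested fraction.
  cutoff <- quantile(vars[expressed], probs = 1 - top_fraction, na.rm = TRUE)
  keep <- expressed & vars >= cutoff

  expr[, keep, drop = FALSE]
}

# Usage with the object from the question (not run here):
# wgcna_filtered <- filter_transcripts(wgcna_data, min_mean = 1,
#                                      top_fraction = 0.5)
# dim(wgcna_filtered)
```

Neither step looks at the sample groups, so the filter stays independent of any differential expression test.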

"filter by mean expression or variance is a matter of debate" Which one is better, or which one do you use? I usually filter genes with keep <- rowSums(cpm(DataExpr) > 1) > 1. Is this good enough?

5.1 years ago by Arindam Ghosh ▴ 530
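For reference, the one-liner above keeps genes with a CPM above 1 in at least two samples. Below is a base-R sketch of the same logic with the thresholds made explicit. The names cpm_simple and keep_expressed are hypothetical, and counts is assumed to be a genes x samples matrix of raw counts. Library sizes are taken as plain column sums, which as far as I know is what edgeR's cpm() does for a bare count matrix. Whether a given threshold is good enough depends on library sizes and the number of samples, so the defaults here are placeholders.

```r
# Counts-per-million from a genes x samples count matrix, using plain column
# sums as library sizes (no normalization factors).
cpm_simple <- function(counts) {
  t(t(counts) / colSums(counts)) * 1e6
}

# Logical vector: TRUE for genes whose CPM exceeds `min_cpm` in at least
# `min_samples` samples. Defaults mirror rowSums(cpm(x) > 1) > 1.
keep_expressed <- function(counts, min_cpm = 1, min_samples = 2) {
  rowSums(cpm_simple(counts) > min_cpm) >= min_samples
}

# Usage (not run here), with DataExpr as in the comment above:
# keep <- keep_expressed(DataExpr, min_cpm = 1, min_samples = 2)
# DataExpr_filtered <- DataExpr[keep, ]
```

Note the orientation: this filter works on genes in rows, whereas WGCNA expects samples in rows, so the filtered matrix would still need to be normalized and transposed before network construction.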