Hi all,
Curious if anyone can explain a couple of disparities in hub gene selection according to connectivity in WGCNA. I regularly make kWithin versus kME plots as a diagnostic tool when constructing networks in WGCNA, and in those plots I also mark each module's hub gene, selected using chooseTopHubInEachModule().
Usually this creates a plot showing a strong positive correlation between kWithin and kME (which should be the case), with the hub gene from chooseTopHubInEachModule() (larger, colored point in each plot) among the most strongly connected genes on both axes (though not always the single strongest; see attached plot).
I've noticed two unexpected patterns from this, however:
- Sometimes the hub chosen by chooseTopHubInEachModule() is fairly low in both kME and kWithin (see the 'blue' hub in module 2; top row, center column)
- And sometimes kME and kWithin are not correlated at all (see module 6; center row, right column, with the red hub gene)
Any ideas why intramodular connectivity (kWithin) would be out of whack with eigengene connectivity (kME) for module 6? Also, if chooseTopHubInEachModule() is selecting hubs based on which gene has the highest total connectivity to all genes across all modules, why don't those hubs have the highest kWithin values in their respective modules? I've repeated these plots with kTotal instead of kWithin and the patterns remain the same.
Thanks for reading.
WGCNA script is below:
net_allClusters_noPAM <- blockwiseModules(datExpr_allClusters_sub, power = 9, corType = "bicor", deepSplit = 2,
    networkType = "signed", mergeCutHeight = 0.2, maxPOutliers = 0.1,
    TOMType = "signed", minModuleSize = 20,
    numericLabels = TRUE, pamStage = FALSE,
    pamRespectsDendro = TRUE, saveTOMs = TRUE,
    saveTOMFileBase = "allClusters_TOM_noPAM",
    verbose = 3, maxBlockSize = 7000)
MEs_net_allClusters = net_allClusters_noPAM$MEs
modulecolors_allClusters_noPAM <- labels2colors(net_allClusters_noPAM$colors)
hubs_allClusters <- as.data.frame(chooseTopHubInEachModule(datExpr_allClusters_sub,modulecolors_allClusters_noPAM,power=9))
connectivity_allClusters <- intramodularConnectivity.fromExpr(datExpr_allClusters_sub, modulecolors_allClusters_noPAM,
    corFnc = "bicor", corOptions = "use = 'p'",
    weights = NULL,
    distFnc = "dist", distOptions = "method = 'euclidean'",
    networkType = "signed", power = 9,
    scaleByMax = FALSE, # can set this to TRUE, but scaled kWithin is easy to get outside of the function
    ignoreColors = "grey", # module labels are color names here, so "grey" marks unassigned genes
    getWholeNetworkConnectivity = TRUE)
rownames(connectivity_allClusters) <- colnames(datExpr_allClusters_sub)
connectivity_allClusters$module_color <- modulecolors_allClusters_noPAM
connectivity_allClusters$gene <- row.names(connectivity_allClusters)
connect <- connectivity_allClusters[order(connectivity_allClusters$module_color,-connectivity_allClusters$kWithin),]
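allClusters_kME <- signedKME(datExpr_allClusters_sub, MEs_net_allClusters, corFnc = "bicor")
# signedKME() keeps gene names as row names, so add a "gene" column to join on
allClusters_kME$gene <- rownames(allClusters_kME)
# left_join() comes from dplyr
allClusters_kME_kWithin <- dplyr::left_join(connect, allClusters_kME, by = "gene")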
EDIT: Added this as a comment too, but one of the questions I raised is answered by simply specifying corFnc in chooseTopHubInEachModule(). The disparities in hub gene selection between methods are resolved when the same type of correlation is used.
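For anyone who wants to see why the correlation type matters, here is a minimal base-R sketch on made-up data. It assumes the hub is simply the gene with the largest row sum of the signed adjacency within its module, which is my understanding of what chooseTopHubInEachModule() does. Spearman stands in for bicor only so the sketch runs without WGCNA; the toy matrix and the helper names (signed_adjacency, k_within) are hypothetical.

```r
# Toy data: 30 samples x 12 genes forming a single module
set.seed(42)
n_samples <- 30
driver <- rnorm(n_samples)
expr <- sapply(1:12, function(i) driver + rnorm(n_samples, sd = 0.3 + 0.1 * i))
colnames(expr) <- paste0("gene", 1:12)
expr[1, 10:12] <- expr[1, 10:12] + 8  # one outlying sample in the three noisiest genes

# Signed adjacency: ((1 + cor) / 2)^power
signed_adjacency <- function(x, power, method) {
  ((1 + cor(x, method = method)) / 2)^power
}

# Within-module connectivity: row sums of the adjacency, self-connections excluded
k_within <- function(x, power, method) {
  adj <- signed_adjacency(x, power, method)
  diag(adj) <- 0
  rowSums(adj)
}

k_pearson <- k_within(expr, power = 9, method = "pearson")
k_robust  <- k_within(expr, power = 9, method = "spearman")

# Hub = gene with the highest within-module connectivity under each correlation
hub_pearson <- names(which.max(k_pearson))
hub_robust  <- names(which.max(k_robust))

# Where does the Pearson-chosen hub rank on the robust kWithin axis? (1 = top)
rank_of_pearson_hub <- rank(-k_robust)[hub_pearson]

print(c(hub_pearson = hub_pearson, hub_robust = hub_robust))
print(rank_of_pearson_hub)
```

With WGCNA loaded, the corresponding change to the script above would be to pass corFnc = "bicor" to chooseTopHubInEachModule(), so the hub is picked from the same kind of adjacency that intramodularConnectivity.fromExpr() and signedKME() use.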
Just curious, for the modules that lack this correlation between the kWithin and kME values, how do their overall strength and connectivity compare to the modules that do have this correlation?
Somewhat weaker, but connectivity (kWithin) should go down in smaller modules like module 6. What is interesting is that the kME values for that module span a lower range when computed against its own module eigengene than when computed against a different module's eigengene (module 1 in this case). So are the expression profiles of member genes in module 6, overall, more similar to the module 1 eigengene?
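One way to check this might be to compare each gene's kME against its own module eigengene with its best kME across all eigengenes. Below is a rough base-R sketch on simulated data. It uses Pearson correlation and a hand-rolled eigengene (first singular vector of the scaled expression, sign-aligned with the module mean), both of which are assumptions for illustration. The module labels "1" and "6" and all object names are hypothetical stand-ins for the real network.

```r
set.seed(7)
n_samples <- 40

# Two toy modules: "1" is large and tight, "6" is small, noisy, and partly shares signal with "1"
drv1 <- rnorm(n_samples)
drv6 <- 0.6 * drv1 + rnorm(n_samples)
mod1 <- sapply(1:20, function(i) drv1 + rnorm(n_samples, sd = 0.4))
mod6 <- sapply(1:6,  function(i) drv6 + rnorm(n_samples, sd = 1.5))
expr <- cbind(mod1, mod6)
colnames(expr) <- paste0("gene", seq_len(ncol(expr)))
module <- rep(c("1", "6"), c(20, 6))

# Eigengene: first left singular vector of the scaled samples x genes matrix,
# flipped if needed so it correlates positively with the module's mean expression
eigengene <- function(x) {
  xs <- scale(x)
  pc1 <- svd(xs)$u[, 1]
  if (cor(pc1, rowMeans(xs)) < 0) pc1 <- -pc1
  pc1
}
MEs <- sapply(split(seq_len(ncol(expr)), module),
              function(idx) eigengene(expr[, idx, drop = FALSE]))

# kME table: genes in rows, module eigengenes in columns
kME <- cor(expr, MEs)

# kME against each gene's own module, and the module giving the highest kME
own_kME <- kME[cbind(seq_len(nrow(kME)), match(module, colnames(kME)))]
best_module <- colnames(kME)[max.col(kME, ties.method = "first")]

summary_tbl <- data.frame(gene = colnames(expr), module, own_kME, best_module,
                          mismatched = best_module != module)

# Cross-tabulate assigned module against best-matching eigengene
print(table(assigned = module, best = best_module))
```

On real data the kME table would come from signedKME() instead. If many module 6 genes show up in the "best = 1" column, that would suggest module 6 is not well separated from module 1 at the current mergeCutHeight.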