I am running WGCNA and trying to visualize the gene network as TOM plot. Rather than using the native function in the package, I am looking for plotting it out by ggplot. The main reason is I would like to also annotate the TOM plot by placing rectangles on each module found by the WGCNA.
However, I found a very hard time reproducing the plot of the native function, one of the main issues is when I trying to look for the order of genes by using the net$dendrograms[[1]]$order
and also the module labels by net$color
, I find that the color (or module) are not clustered together. Rather they are very scattered. Therefore, I am not sure if I can place a rectangle to bracket the modules. However, when I look at the dendrograms with module labels plotted by the native WGCNA function, the module (at least at the color level) seems clustered together. I did not use the pam options and the tree deep is set to 2, which I believe is quite conservative in defining modules.
I am confused and not sure whether I have missed something.
the module labels in net$color follows the gene order in the expression matrix:
Why?
I acknowledged that the
net$color
is following the order in the expression matrix. Thus, I have reordered thenet$color
by thenet$dendrograms[[1]]$order
. However, even after the rearrangement, the color labels are still very scattered. I can hardly find a stretch of consecutive genes that are grouped into a cluster that is more than 7/8 genes (I have already dropped cluster 0, but the modules are still interspersed).Indeed, I am looking to do two modification that seems cannot be achieved easily by the native function:
What is the output of:
head(net$color)
andhead(net$dendrograms[[1]]$order)
Something like Fig2 panels B-D-F but using the entire network TOMs from two different datasets? link
Yes, I would like to have a similar plot as B-D-F but for the whole TOM network. Indeed, I also like their color palette, thus would be great if I can know how to reproduce their figures.
I am not sure if
head()
would really give you enought information. I useTOM[net$dendrograms[[1]]$order, net$dendrograms[[1]]$order]
to rearrange the matrix. And below is an excerpt of the module annotation (after merging them), the geneorder is from dendrograms:I can hardly find a region that is not interspersed by module 0. similar results were found for the tail:
I have a solution for that but with TOMplot.
Note that
datExp1
anddatExpr2
must have the same gene orderSince you used the
blockwiseModules
function you will need to calculate again thegene tree dendrogram (geneTree)
and themodules (moduleColors)
for the dataset `datExp1'.Thanks for your suggestion, but I am afraid that does not fully solve my questions: I would like to add a rectangle on top of the heatmap to highlight the module. Another minor problem would be (although I did not mention it in my original question) I was planning to add the dendrogram for each dataset by
patchwork
or similar package, hence, I probably need photoshop the output after usingTOMplot
.More importantly, I think the core issue is unsolved: why the module labels are not clustered as the dendrogram.
This issue is automatically solved once you plot the heatmap of each TOM along with their respective gene dendrogram. Any good tool designed for heatmap will allow you to reaorder rows and columns using an external dendrogram.
So you basically want two independent TOMplot (upper and lower triangle) in a single panel. With some tweak maybe you can use the chunk of code at the end of this tutorial . I am also interested in a plot like that, so I will keep you updated.
A final note, If you can avoid the
blockwiseModules
approachSorry for the late reply. I will definately try! Thanks.
Is explained here why you should avoid the
blockwiseModules
approach.Instead, use the step by step network construction pipeline: link
I went back to the matrix and tried to plot the TOM using the native function. I think the gene (and the colors representing the modules) clustered properly. Noted that I used the same
geneTree
andmoduleColors
to get the values inputs forTOMplot
. That's why I am so confused if I am wrong if any steps, so that it is wrong when I plot them out with ggplot.I have increased the block size for
blockwiseModules
to 250000 which includes all the input genes/ features (220000). And I have also checked the outputnet
, all the genes were analyzed in a single block.