I am performing automatic network construction with WGCNA, exactly as described in the tutorial. I can retrieve all the modules. However, I get an error when plotting the TOM for all genes; when I plot a random subset of genes, the heatmap is generated fine. Is this a memory problem? Any help would be appreciated. Thank you.
> dissTOM = 1 - TOMsimilarityFromExpr(datExpr, power = 6);
Rough guide to maximum array size: about 46000 x 46000 array of doubles..
TOM calculation: adjacency..
..will use 32 parallel threads.
 Fraction of slow calculations: 0.000000
..connectivity..
..matrix multiply..
..normalize..
..done.
> plotTOM = dissTOM^7;
> diag(plotTOM) = NA;
> sizeGrWindow(9, 9)
> TOMplot(plotTOM, geneTree, moduleColors, main = "Network heatmap plot, all genes")
Error in .heatmap(as.matrix(dissim), Rowv = as.dendrogram(dendro, hang = 0.1), :
  row dendrogram ordering gave index of wrong length
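For reference, the random-subset plot that does work is essentially the tutorial's "selected genes" step; a minimal sketch, assuming dissTOM and moduleColors from above (the subset size of 400 and the seed are arbitrary choices):

nGenes = ncol(datExpr)
nSelect = 400
set.seed(10)                    # make the random subset reproducible
select = sample(nGenes, size = nSelect)
selectTOM = dissTOM[select, select]
# the subset must be re-clustered; the full geneTree cannot simply be sliced
selectTree = hclust(as.dist(selectTOM), method = "average")
selectColors = moduleColors[select]
plotDiss = selectTOM^7          # raise to a power to emphasize strong connections
diag(plotDiss) = NA             # blank the diagonal, as in the full-size plot
TOMplot(plotDiss, selectTree, selectColors, main = "Network heatmap plot, selected genes")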
Were you able to solve this?
I had the same error; maybe what worked for me will work for you. I suspected it might be due to input size and memory issues, so I reduced my input from 35 observations of ~8,000 genes to 15 observations of ~2,500 genes just to test it. The smaller data set plotted perfectly, just like in the tutorial. I have no idea why the full set would produce this particular error, though.
Hi,
Indeed, you are right. I don't know why the WGCNA developers haven't fixed this problem.
Thank you
@Ross Campbell: But how do you reduce the list?
I just opened it in Excel and chopped off a bunch of columns and about half the rows. Obviously that messes up the data; I only did it to test the idea. I'm hoping to run the complete data set on a high-performance cluster later today.
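If you want to avoid Excel, a quick way to cut a comparable test subset directly in R; a minimal sketch, assuming datExpr is the samples x genes matrix from the tutorial (the name datExprTest, the seed, and the subset sizes are arbitrary, for debugging only):

set.seed(42)                                   # reproducible test subset
keepSamples = sample(nrow(datExpr), 15)        # e.g. 15 of the 35 observations
keepGenes   = sample(ncol(datExpr), 2500)      # e.g. 2,500 of the ~8,000 genes
datExprTest = datExpr[keepSamples, keepGenes]  # trimmed matrix for testing only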
Hi,
Isn't there a better method, like selecting the rows with the most significant differences, or something like that?
I believe the best approach is to keep all the data. WGCNA is designed as an unsupervised method, so filtering rows by significance thresholds could skew the clustering. It is better to keep the data intact and run the full set on a more powerful machine. I only trimmed mine to troubleshoot the error and debug the code on my local machine; after that I ran the full data set on a server and it worked fine.
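As a rough sanity check on the memory angle (simple arithmetic, not a WGCNA function): a dense double-precision dissimilarity matrix needs nGenes^2 * 8 bytes, and TOMplot/heatmap make additional working copies, so the peak usage can be several times that.

nGenes = ncol(datExpr)
nGenes^2 * 8 / 1024^3   # GB per dense matrix copy: ~0.5 GB at 8,000 genes, ~16 GB at 46,000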