WGCNA - problem with hclustplot (Human Chimp paper tutorial)
1
0
Entering edit mode
2.3 years ago

Hi All,

I am using the code from the Human vs. Chimp brain paper tutorial: https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/HumanChimp/HumanChimpNetworkAnalysis.pdf

I am having trouble with the hclustplot1 output that should result from this part of the code:

par(mfrow=c(2,2),mar=c(2,2,2,2))
plot(hierTOMHuman,main="WT hippocampus",labels=F)
abline(h=.95,col="red")
plot(hierTOMChimp,main="N3-/- hippocampus",labels=F)
hclustplot1(hierTOMHuman,colorh1,title1="WT network, WT colors")
hclustplot1(hierTOMChimp,colorh1,title1="N3-/- network, WT colors")
colorh=as.character(colorh1)
colorhALL=rep("grey", length(ConnectivityChimp))
colorhALL[rest1]=as.character(colorh)

My plot does not show the colors for either Human or Chimp; there is just empty space under the titles. The first (main) titles and dendrograms are plotted fine, and the next titles are fine too, but nothing appears where the colors should be. Is there a bug in this code, or am I doing something wrong?

bad plot

The next plot is also not working well, but I assume the error I get is related to memory rather than to the code itself (?)

TOMplot1(distTOMHuman,  hierTOMHuman , colorh)

Error: C stack usage 7955296 is too close to the limit

Besides the error, the resulting plot does not look right either. The colors are not aligned, which looks very weird:

weird image


Regarding hclustplot1, I think the problem is with par(mfrow=c(2,2),mar=c(2,2,2,2)). Try increasing the margins and see if the color strips appear under each dendrogram (see how to use par mar).
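
For reference, `mar` is given as c(bottom, left, top, right), in lines of text. Below is a minimal base R sketch of a dendrogram with a color strip underneath. It uses made-up data and hypothetical module colors; it is not the tutorial's hclustplot1, only an illustration of how the layout and margin settings interact.

```r
# Toy data: 30 "genes" measured in 8 samples (made-up values, not from the thread)
set.seed(1)
expr <- matrix(rnorm(30 * 8), nrow = 30)
hier <- hclust(dist(expr), method = "average")

# Hypothetical module colors, one per gene, in the original gene order
module_colors <- rep(c("turquoise", "blue", "brown"), each = 10)
stopifnot(length(module_colors) == length(hier$order))

pdf(NULL)  # null device, so the sketch also runs without a display

# Two stacked panels: a tall dendrogram on top, a short color strip underneath
layout(matrix(1:2, nrow = 2), heights = c(4, 1))

par(mar = c(0, 4, 3, 1))  # no bottom margin, so the strip sits close to the tree
plot(hier, labels = FALSE, main = "Toy dendrogram", xlab = "", sub = "")

par(mar = c(2, 4, 0, 1))  # no top margin for the strip panel
strip_cols <- module_colors[hier$order]  # reorder colors to follow the tree
barplot(rep(1, length(strip_cols)), col = strip_cols, border = NA,
        space = 0, axes = FALSE)

dev.off()
```

The horizontal alignment between the tree and the strip is only approximate here. If the strip panel is too small for its margins, R stops with a "figure margins too large" error instead of silently drawing nothing.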

Regarding the error message:

Error: C stack usage 7955296 is too close to the limit

This could be a problem caused by the function TOMplot1. Try using TOMplot instead of TOMplot1. Keep in mind that if you have a lot of genes, it will take some time to plot the heatmap.
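
The number in that message is the amount of C stack in use, in bytes. As a quick diagnostic (a sketch, not a fix), base R's Cstack_info() reports the stack limit of the current session, which can be compared with the value in the error. Whether the limit is reached because of deep recursion inside the plotting code is an assumption, not something shown in this thread.

```r
# Report the C stack limit and current usage of this R session (base R)
stack_info <- Cstack_info()

# Named vector with elements: size, current, direction, eval_depth
print(stack_info)
```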

2.3 years ago

Hi Andres,

Thank you for your help! I increased the margins and tested different values, but it doesn't seem to change anything. The TOMplot function did not show any error, but it finished quite quickly and produced only the color bars (properly aligned this time, but with no matrix).

TOMplot_result

I am also a bit puzzled about the choice of genes (RNA-Seq) for WGCNA analysis. Is it reasonable to use as many genes as possible considering the available computational power? Or is it better to limit the genes, e.g. to only the highly expressed or highly variable ones? Currently I am using the 10,000 most variable genes, but it looks like my computer would cope with a bigger set (M1, 16 GB).


Hi Anna,

As I said, TOMplot needs some time to plot the heatmap. Regarding hclustplot1, try plotDendroAndColors instead:

    plotDendroAndColors(hierTOMHuman, colorh1, "Modules",
                        dendroLabels = FALSE, hang = 0.03,
                        addGuide = TRUE, guideHang = 0.05,
                        main = "Gene dendrogram and module colors")

> Is it reasonable to use as many genes as possible considering the available computational power?

There is no straight answer to this. I personally use edgeR::filterByExpr() to filter out lowly expressed genes, but in the Human Chimp tutorial they use genes with a scaled network connectivity > 0.1 (pages 16 and 19). Whether you filter by variance, expression level, or connectivity, "uninteresting genes" will be placed in the grey module or will not contribute to the module connectivity. In conclusion, even if you have enough RAM, building a WGCNA network using all the genes in the expression matrix does not make any sense to me.
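
As an illustration of the variance-based option only, here is a minimal base R sketch that keeps the most variable genes from a toy matrix. The data, the cut-off `n_keep`, and the variable names are made up for the example; edgeR::filterByExpr() and the connectivity filter from the tutorial are not shown.

```r
set.seed(42)

# Toy expression matrix: 200 genes (rows) x 12 samples (columns), made-up values.
# Each gene gets its own standard deviation so that variances differ.
gene_sd <- seq(0.1, 2, length.out = 200)
expr <- matrix(rnorm(200 * 12, mean = 8, sd = rep(gene_sd, 12)),
               nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:12)))

n_keep <- 50  # hypothetical cut-off; the thread uses the 10,000 most variable genes

# Per-gene variance across samples, then the names of the top n_keep genes
gene_var <- apply(expr, 1, var)
keep <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_keep)]

expr_filtered <- expr[keep, , drop = FALSE]
dim(expr_filtered)
```

On log-scale expression data, a variance filter tends to favor genes that differ between conditions, while an expression-level filter removes genes with too few reads regardless of how much they vary, so the two can select quite different gene sets.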


Thank you for the clarification - I have also tried using connectivity to further filter the 10,000 most variable genes.

As for the plot, I solved it in a weird way. I am not sure why this works, but I changed my code this way:

    par(mfrow=c(3,2), mar=c(2,2,2,2))
    plot(hierTOMHuman,main="Pup hippocampus",labels=F)
    abline(h=.96,col="red")
    plot(hierTOMChimp,main="Adult hippocampus",labels=F)

    hclustplot1(hierTOMHuman,colorh1,title1="Pup network, pup colors")
    hclustplot1(hierTOMChimp,colorh1,title1="Adult network, pup colors")

    hclustplotn(hierTOMHuman,colorh1)
    hclustplotn(hierTOMChimp,colorh1)

Adding the hclustplotn function produced the part of the plot that was missing. I can adjust it in Illustrator, so the gap is not an issue. I will also check the method you proposed, thanks a lot!

good_plot


> solved it in a weird way

What do you mean?


I mean I was not sure why exactly the function hclustplotn worked and hclustplot1 didn't, and I thought it was weird that I need to use both functions (one gives the title, the other the plot).

If you don't mind, I have another question. I have MDS plots for both genotypes and I am not sure if they are correct.

WT MDS plot vs Mutant MDS plot

From the plot it looks like the Turquoise module in Mutant is a kind of opposite to the same module in Wildtype, but when I look at the heatmap, the expression levels are actually very similar in both genotypes. The major difference in this module is due to age. Connectivity however is different if we correlate in genotypes (Spearman corr. rho = 72).

So my question is: am I misreading the MDS plot? How should I interpret the numbers on the x and y axes?


> I mean I was not sure why exactly the function hclustplotn worked and hclustplot1 didn't, and I thought it was weird that I need to use both functions (one gives the title, the other the plot).

Those functions are very old. They were first implemented in 2006 and used in a paper published in PNAS.

> From the plot it looks like the Turquoise module in Mutant is a kind of opposite to the same module in Wildtype, but when I look at the heatmap, the expression levels are actually very similar in both genotypes.

Multidimensional scaling (MDS) takes the dissimilarity values as input and translates them into Euclidean distances. You should instead check the expression profiles of the blue and brown modules. I think that the points (genes) located at the tips of the brown and blue modules are those causing the opposite orientation you observe between the WT and N3 networks.
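
On the question of how to read the axis numbers: in classical MDS the coordinates are arbitrary up to rotation and reflection, and only the distances between points carry information. The sketch below uses base R's cmdscale() on made-up points to show that mirroring an axis leaves every pairwise distance unchanged. This is a general property of MDS; whether it accounts for the difference between the WT and mutant plots in this thread is not established here.

```r
set.seed(7)

# Toy dissimilarity: Euclidean distances between 20 made-up points in 5 dimensions
pts <- matrix(rnorm(20 * 5), nrow = 20)
diss <- dist(pts)

# Classical MDS: 2 coordinates per object
mds <- cmdscale(diss, k = 2)

# Mirror the first axis; the plot would look "opposite" along x
mds_flipped <- mds %*% diag(c(-1, 1))

# Both configurations reproduce the same pairwise distances
same_distances <- isTRUE(all.equal(as.vector(dist(mds)),
                                   as.vector(dist(mds_flipped))))
same_distances
```

When two networks are scaled separately, a mirrored or rotated layout between the two plots can therefore appear even when the underlying distance structure is similar, so comparing which genes sit near each other is more informative than comparing axis values.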

> The major difference in this module is due to age.

That is correct.

> Connectivity however is different if we correlate in genotypes (Spearman corr. rho = 72).

If you look at the top and bottom of the heatmap, there are two clusters of genes that have opposite signs of expression between the WT and N3 samples.


Thank you for clarifying - yes, you are right, I can see the opposite expression levels at the top and bottom of the Turquoise heatmap, and when I check Blue and Brown the sign is also opposite by genotype for old mice. Now I better understand why these MDS plots look so different, thanks!
