Question: how do I detect which WGCNA modules are related to which Group?
4.6 years ago
newbie ▴ 130

I have seen several posts about using trait information and detecting which module is related to which Group of trait information. I'm very confused with this.

I performed co-expression analysis with WGCNA using 15,000 protein-coding genes and lncRNAs. There are 27 samples in total: 15 Tumor and 12 Normal. I log2-transformed the raw counts and used that for WGCNA.

The log2(counts + 1) data are in a matrix datExpr, which looks like this:

         A1BG      A2M       A2ML1      A4GALT    AAAS
Sample1  4.807355  16.45546  11.30777   9.861087  10.154818
Sample2  5.209453  15.7512   13.922956  8.434628  10.082149
Sample3  4.392317  15.96689  11.932953  9.481799  9.903882
Sample4  3         14.34721  11.558421  8.848623  10.197217
Sample5  4.954196  12.08215  5.882643   6.285402  9.005625

A total of 23 modules were detected. Using moduleEigengenes, I merged the modules with similar expression profiles and ended up with 10 modules.

The trait information for my data is in a data frame coldata, which looks like this:

head(coldata)

SampleName  Group
Sample1     Normal
Sample2     Tumor
Sample3     Normal
Sample4     Tumor
Sample5     Normal
Sample6     Normal

Using the coldata information above, I'm interested in detecting modules that are related to the Tumor class. MEs, which holds the module eigengenes, is used for this.

moduleTraitCor = cor(MEs, coldata, use = "p")

And the output looked like this:

[image: output of cor(), not preserved]

So, then I changed the coldata like below:

SampleName  Group
Sample1     0
Sample2     1
Sample3     0
Sample4     1
Sample5     0
Sample6     0

moduleTraitCor = cor(MEs, coldata, use = "p")

And the output looked like this:

[image: output of cor(), not preserved]

Questions:

1) What mistake am I making? Why do I see NAs?

2) How do I know which are the modules related to Tumor group?

3) Once I get the correct moduleTraitCor should I use negatively/positively correlated modules for pathway analysis?

RNA-Seq R wgcna coexpression trait • 1.5k views
4.6 years ago

Hello newbie,

1) What mistake am I making? Why do I see NAs?

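The SampleName values probably have to be row names (matching the row names of MEs) rather than a column, so that cor() receives only numeric trait columns.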
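
A minimal sketch of that fix, using simulated stand-ins for MEs and coldata (the numbers and the module names MEblue and MEbrown are made up for illustration). It moves SampleName into the row names, encodes Group as 0/1, and checks the sample order before calling cor():

```r
set.seed(1)
samples <- paste0("Sample", 1:6)

# Simulated module eigengenes: samples in rows, modules in columns
MEs <- data.frame(MEblue = rnorm(6), MEbrown = rnorm(6), row.names = samples)

# Trait table in the same shape as the coldata shown in the question
coldata <- data.frame(
  SampleName = samples,
  Group = c("Normal", "Tumor", "Normal", "Tumor", "Normal", "Normal")
)

# Keep only numeric trait columns; the sample names become row names
traits <- data.frame(
  Tumor = as.numeric(coldata$Group == "Tumor"),
  row.names = coldata$SampleName
)

# Samples must be in the same order in both tables
stopifnot(identical(rownames(MEs), rownames(traits)))

# One correlation per module (rows) and trait (columns)
moduleTraitCor <- cor(MEs, traits, use = "p")
print(moduleTraitCor)
```

With a single binary trait, a positive correlation means the eigengene tends to be higher in Tumor than in Normal samples.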

2) How do I know which are the modules related to Tumor group?

Next, you need to 'regress' each module eigengene against your end-point, the tumor group. You can use lm() or glm() for this.
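
As a rough sketch of what such a regression could look like, the following uses simulated eigengenes (MEblue and MEbrown are hypothetical names, and the Tumor shift in MEblue is built in on purpose). It fits one lm() per module with Group as the predictor and collects the Tumor-vs-Normal estimate and p-value:

```r
set.seed(42)
n <- 27

# 15 Tumor and 12 Normal samples, as in the question; Normal is the reference level
group <- factor(rep(c("Tumor", "Normal"), times = c(15, 12)),
                levels = c("Normal", "Tumor"))

# Simulated eigengenes: MEblue is shifted upwards in Tumor, MEbrown is pure noise
MEs <- data.frame(
  MEblue  = rnorm(n) + 3 * (group == "Tumor"),
  MEbrown = rnorm(n)
)

# One linear model per module: eigengene ~ Group
fits <- lapply(MEs, function(me) lm(me ~ group))

# Estimate and p-value of the Tumor-vs-Normal coefficient for each module
results <- t(sapply(fits, function(fit) {
  coef(summary(fit))["groupTumor", c("Estimate", "Pr(>|t|)")]
}))

# Adjust for testing several modules at once
results <- cbind(results, padj = p.adjust(results[, "Pr(>|t|)"], method = "BH"))
print(results)
```

Modules with a small adjusted p-value are the candidates for being related to the Tumor group; the sign of the estimate tells you the direction. Using glm() with family = binomial and Group as the outcome is an alternative way to set up the same question.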

3) Once I get the correct moduleTraitCor should I use negatively/positively correlated modules for pathway analysis?

You can use both separately or combined - the meaning of the result just differs in each case.

Kevin


Thanks for the reply, Kevin. A small question about the module-trait relationship.

# Set the minimum module size
minModuleSize = 50;

# Module identification using dynamic tree cut
dynamicMods = cutreeDynamic(dendro = geneTree,  method="tree", minClusterSize = minModuleSize);

# The following command gives the module labels and the size of each module. Label 0 is reserved for unassigned genes
table(dynamicMods)

# dynamicMods
# 0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18 
# 3714 7155 6200 1315 1001  999  855  829  641  417  366  362  325  295  245  209  206  175   76

#Plot the module assignment under the dendrogram; note: The grey color is reserved for unassigned genes
dynamicColors = labels2colors(dynamicMods)
table(dynamicColors)

# dynamicColors
#        black         blue        brown         cyan        green  greenyellow         grey       grey60
#          829         6200         1315          245          999          362         3714          175
#    lightcyan   lightgreen      magenta midnightblue         pink       purple          red       salmon
#          206           76          417          209          641          366          855          295
#          tan    turquoise       yellow
#          325         7155         1001

And then merged the modules like below:

MEList = moduleEigengenes(datExpr, colors = dynamicColors)
MEs = MEList$eigengenes

# Calculate dissimilarity of module eigengenes 
MEDiss = 1-cor(MEs);  

library(flashClust)
# Cluster module eigengenes 
METree = flashClust(as.dist(MEDiss), method = "average"); 

# Plot the result 
sizeGrWindow(7, 6)

plot(METree, main = "Clustering of module eigengenes", xlab = "", sub = "") 
MEDissThres = 0.2 

# Plot the cut line into the dendrogram 
abline(h=MEDissThres, col = "red")
dev.off()

# Call an automatic merging function 
merge = mergeCloseModules(datExpr, 
                          dynamicColors, cutHeight = MEDissThres, verbose = 3) 

# The merged module colors 
mergedColors = merge$colors; 

# Eigengenes of the new merged modules: 
mergedMEs = merge$newMEs; 

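# Module colors after merging. merge$colors already holds color names here
# (dynamicColors was passed in), so labels2colors() must not be applied again
moduleColors = mergedColors

# Numeric module labels corresponding to the colors
colorOrder = c("grey", standardColors(50))
moduleLabels = match(moduleColors, colorOrder) - 1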

# Relating modules to external clinical traits:
# compute the 1st principal component of each module as its eigengene,
# correlate the eigengenes with external traits and look for the most significant associations.
# Define the numbers of genes and samples

nGenes = ncol(datExpr);
nSamples = nrow(datExpr);

#calculate eigengenes (1st principal component) of modules
MEs0 = moduleEigengenes(datExpr, ?)$eigengenes

1) Now I want to create a heatmap of the correlations between modules and trait information. I am not sure whether, in MEs0 = moduleEigengenes(datExpr, ?)$eigengenes, I have to use dynamicColors (before merging) or moduleColors (after merging).

2) There are a few modules, like turquoise and blue, with more than 7000 and 6000 genes. That number is huge. How do I reduce it?


1) Now I want to create a heatmap of the correlations between modules and trait information. I am not sure whether, in MEs0 = moduleEigengenes(datExpr, ?)$eigengenes, I have to use dynamicColors (before merging) or moduleColors (after merging).

Hey again, I actually do not know the answer - I have not used WGCNA too much. Generally, you should go through all of the tutorials on the WGCNA web-site - then you should know the correct approach to use (?)

I mean, here, they use dynamicColors: http://pklab.med.harvard.edu/scw2014/WGCNA.html

MEList = moduleEigengenes(datExpr, colors = dynamicColors)
MEs = MEList$eigengenes
plotEigengeneNetworks(MEs, "", marDendro = c(0,4,1,2), marHeatmap = c(3,4,1,2))
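
To make the role of the colors argument more concrete, here is a base-R sketch of the idea behind a module eigengene: the first principal component of the scaled expression of the genes carrying one color label. The data and the moduleColors vector below are simulated, and this is an illustration of the concept, not the WGCNA implementation. Whichever label vector is passed in (before or after merging) decides which genes are grouped together, so the eigengenes follow that choice:

```r
set.seed(3)

# Simulated expression: 8 samples (rows) x 6 genes (columns)
datExpr <- matrix(rnorm(8 * 6), nrow = 8,
                  dimnames = list(paste0("Sample", 1:8), paste0("Gene", 1:6)))

# Hypothetical module colors, one per gene (for example, the merged colors)
moduleColors <- c("blue", "blue", "blue", "brown", "brown", "brown")

# Eigengene: first principal component of the scaled expression of one module
eigengene <- function(x) {
  pc1 <- prcomp(x, center = TRUE, scale. = TRUE)$x[, 1]
  pc1 / sqrt(sum(pc1^2))  # scale to unit length; the sign of a PC is arbitrary
}

# One eigengene per color: samples in rows, modules in columns
MEs <- sapply(split(seq_along(moduleColors), moduleColors),
              function(idx) eigengene(datExpr[, idx, drop = FALSE]))
colnames(MEs) <- paste0("ME", colnames(MEs))
print(dim(MEs))
```

For a heatmap that should describe the merged modules, the label vector would then be the post-merge colors.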

2) There are a few modules, like turquoise and blue, with more than 7000 and 6000 genes. That number is huge. How do I reduce it?

You could go back a few steps to modify the tree cut height - this is probably the easiest way.
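
The effect of the cut height can be sketched in base R with a plain static cut via cutree() on simulated data. This is a stand-in for the dynamic cut used in the thread, not the same algorithm (in cutreeDynamic the corresponding argument is cutHeight). Lowering the height splits clusters further, so the largest cluster can only stay the same size or shrink:

```r
set.seed(7)

# 200 simulated "genes" measured in 10 samples (stand-in for a TOM-based dissimilarity)
expr <- matrix(rnorm(200 * 10), nrow = 200)
geneTree <- hclust(dist(expr), method = "average")

# Static cuts at a high and at a lower height
high <- cutree(geneTree, h = 0.9 * max(geneTree$height))
low  <- cutree(geneTree, h = 0.6 * max(geneTree$height))

# Compare the number of clusters and the size of the largest cluster
print(c(n_high = length(unique(high)), n_low = length(unique(low))))
print(c(largest_high = max(table(high)), largest_low = max(table(low))))
```

The trade-off is that a lower cut also leaves more genes in very small clusters, which a minimum module size would then drop into the unassigned group.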

----------

Your input data to WGCNA should also preferably follow a normal distribution. Data like FPKM or RPKM do not work too well.
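
A quick way to see why a log transform helps here, sketched on simulated negative-binomial counts for one hypothetical gene (the parameters are arbitrary): raw counts are strongly right-skewed, and log2(count + 1) brings the distribution closer to symmetric, although not exactly normal.

```r
set.seed(11)

# Simulated raw counts for one gene across many samples (arbitrary parameters)
counts <- rnbinom(5000, mu = 100, size = 2)

# Simple moment-based skewness: 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_raw <- skewness(counts)
skew_log <- skewness(log2(counts + 1))

print(c(raw = skew_raw, log2 = skew_log))
```

Variance-stabilizing transformations from RNA-seq packages are a common alternative to a plain log2, but they are outside the scope of this sketch.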
