I have seen several posts about using trait information to detect which module is related to which group in the trait data, but I'm still very confused by this.
I performed co-expression analysis with WGCNA using 15,000 protein-coding genes and lncRNAs across a total of 27 samples (15 Tumor and 12 Normal). I log2-transformed the raw counts and used that as input for WGCNA.
The log2-transformed (counts + 1) data is in a matrix datExpr. It looks like this:
A1BG A2M A2ML1 A4GALT AAAS
Sample1 4.807355 16.45546 11.30777 9.861087 10.154818
Sample2 5.209453 15.7512 13.922956 8.434628 10.082149
Sample3 4.392317 15.96689 11.932953 9.481799 9.903882
Sample4 3 14.34721 11.558421 8.848623 10.197217
Sample5 4.954196 12.08215 5.882643 6.285402 9.005625
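As a minimal sketch of the transformation described above, assuming the raw counts sit in a genes-by-samples matrix (the tiny `counts` object below is made up for illustration, not the real data), the log2(count + 1) matrix with samples in rows could be built like this:

```r
# Hypothetical toy counts: genes in rows, samples in columns (made-up values).
counts <- matrix(
  c(27, 7, 30,
    89900, 20840, 4335),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("A1BG", "A2M"), c("Sample1", "Sample4", "Sample5"))
)

# log2(count + 1) avoids -Inf for zero counts. WGCNA expects samples in rows
# and genes in columns, hence the transpose.
datExpr <- t(log2(counts + 1))

datExpr["Sample4", "A1BG"]  # log2(7 + 1) = 3
```

A value of exactly 3, as seen for Sample4 / A1BG in the table, corresponds to a raw count of 7 under this transformation.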
A total of 23 modules were detected. Using moduleEigengenes, I merged the modules with similar expression profiles, and I finally have 10 modules.
The trait information for my data is in a data frame coldata, which looks like this:
head(coldata)
SampleName Group
Sample1 Normal
Sample2 Tumor
Sample3 Normal
Sample4 Tumor
Sample5 Normal
Sample6 Normal
Using the above coldata information, I'm interested in detecting modules that are related to the Tumor class. MEs, which holds the module eigengene information, is used for this.
moduleTraitCor = cor(MEs, coldata, use = "p")
And the output looked like below:
So, then I changed the coldata like below:
SampleName Group
Sample1 0
Sample2 1
Sample3 0
Sample4 1
Sample5 0
Sample6 0
moduleTraitCor = cor(MEs, coldata, use = "p")
And the output looked like below:
Questions:
1) So, what mistake am I making? Why do I see NAs?
2) How do I know which modules are related to the Tumor group?
3) Once I get the correct moduleTraitCor, should I use the negatively or positively correlated modules for pathway analysis?
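Since `cor()` needs numeric input on both sides, one thing worth checking is whether coldata still carries a non-numeric column such as SampleName, and whether its rows are in the same order as the rows of MEs. A minimal base-R sketch under those assumptions (the eigengenes below are simulated, and `traits` is a hypothetical name) might look like this:

```r
set.seed(1)

# Simulated stand-ins for the real objects (assumption: 6 samples, 3 modules).
samples <- paste0("Sample", 1:6)
coldata <- data.frame(
  SampleName = samples,
  Group = c("Normal", "Tumor", "Normal", "Tumor", "Normal", "Normal"),
  stringsAsFactors = FALSE
)
MEs <- as.data.frame(matrix(
  rnorm(6 * 3), nrow = 6,
  dimnames = list(samples, c("MEblue", "MEbrown", "MEgrey"))
))

# Put the trait rows in the same order as the eigengene rows.
coldata <- coldata[match(rownames(MEs), coldata$SampleName), ]

# Keep only a numeric trait column: 1 = Tumor, 0 = Normal.
traits <- data.frame(Tumor = as.numeric(coldata$Group == "Tumor"),
                     row.names = coldata$SampleName)

# Pearson correlation of each module eigengene with the Tumor indicator.
moduleTraitCor <- cor(MEs, traits, use = "pairwise.complete.obs")

# Student t-test p-value for each correlation.
nSamples <- nrow(MEs)
tStat <- moduleTraitCor * sqrt((nSamples - 2) / (1 - moduleTraitCor^2))
moduleTraitPvalue <- 2 * pt(abs(tStat), df = nSamples - 2, lower.tail = FALSE)

print(cbind(cor = moduleTraitCor[, "Tumor"], p = moduleTraitPvalue[, "Tumor"]))
```

With this 0/1 coding, a positive correlation means the eigengene tends to be higher in Tumor samples and a negative one means it tends to be lower. In WGCNA itself, `corPvalueStudent()` is the packaged counterpart of the p-value step.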
Thanks for the reply, Kevin. A small question about the module-trait relationship.
1) Now I want to create a heatmap of the correlation values between modules and trait information. I'm unsure whether, in
MEs0 = moduleEigengenes(datExpr, ?)$eigengenes
I have to use dynamicColors, which is from before merging, or moduleColors, which is from after merging.
2) There are a few modules, like turquoise and blue, with more than 7000 and 6000 genes. The number is huge. How do I reduce the number?
Hey again, I actually do not know the answer - I have not used WGCNA too much. Generally, you should go through all of the tutorials on the WGCNA website - then you should know the correct approach to use (?)
I mean, here, they use dynamicColors: http://pklab.med.harvard.edu/scw2014/WGCNA.html
You could go back a few steps to modify the tree cut height - this is probably the easiest way.
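To illustrate why changing the cut height changes module sizes, here is a rough base-R sketch on simulated data. It uses plain `hclust()` and `cutree()` rather than WGCNA's dynamic tree cut, and the two heights are arbitrary choices for the toy data, so treat it as an illustration of the principle only:

```r
set.seed(42)

# Simulated expression: 10 samples (rows) x 200 genes (columns).
expr <- matrix(rnorm(10 * 200), nrow = 10)

# Correlation-based dissimilarity between genes, then an average-linkage tree.
geneDist <- as.dist(1 - abs(cor(expr)))
geneTree <- hclust(geneDist, method = "average")

# Cutting lower on the tree splits the genes into more, smaller groups.
highCut <- cutree(geneTree, h = 0.9)
lowCut  <- cutree(geneTree, h = 0.6)

print(c(nHigh = length(unique(highCut)), nLow = length(unique(lowCut))))
print(c(largestHigh = max(table(highCut)), largestLow = max(table(lowCut))))
```

In a WGCNA workflow the comparable knobs would be the arguments of `cutreeDynamic()`, such as `cutHeight`, `deepSplit` and `minClusterSize`; which values suit a given dataset is something to judge from the dendrogram.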
----------
Your input data to WGCNA should also preferably follow a normal distribution. Data like FPKM or RPKM do not work too well.