Hi all,
I'm currently analyzing a microarray dataset using the WGCNA package. So far I've generated the eigenmodules successfully, but I'm running into difficulty when relating the clinical traits to the eigenmodules.
When I compute moduleTraitCor, the result is a matrix of NA values, along with a warning that NAs were introduced by coercion:
nGenes = ncol(wcgna.gbmo.full)
nSamples = nrow(wcgna.gbmo.full)
MEs0 = moduleEigengenes(wcgna.gbmo.full, moduleColors)$eigengenes
MEs = orderMEs(MEs0)
moduleTraitCor = cor(MEs, datTraits, use = "p")
Warning message:
In storage.mode(y) <- "double" : NAs introduced by coercion
moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples)
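The call named in the warning, storage.mode(y) <- "double", suggests that the second argument to cor() (here datTraits) is being forced to numeric. Below is a minimal base-R sketch of that mechanism. The data frame toyTraits and its columns are made up for illustration, and the sketch assumes the data frame is turned into a matrix before the coercion happens.

```r
# Hypothetical toy traits: one numeric column, one text column.
toyTraits <- data.frame(age = c(50, 61, 47),
                        sex = c("M", "F", "M"),
                        stringsAsFactors = FALSE)

# A data frame with any text column becomes a character matrix.
toyMatrix <- as.matrix(toyTraits)

# Forcing the matrix to double turns the non-numeric entries into NA and
# raises the "NAs introduced by coercion" warning; suppressWarnings() is
# only here to keep the sketch quiet.
suppressWarnings(storage.mode(toyMatrix) <- "double")
print(toyMatrix)
```

In this toy case the numeric column survives and only the text column turns into NA.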
My gut tells me the problem lies in the formatting of the clinical traits matrix, but I have formatted it both with the WGCNA tutorial method and with a simpler version (both below) and had no luck with either.
traitData <- read.csv(file.choose(), header = T)
femaleSamples = rownames(wcgna.gbmo.full)
traitRows = match(femaleSamples, traitData$samp)
datTraits = traitData[traitRows, -1]
row.names(datTraits) = traitData[traitRows, 1]
collectGarbage()

Simpler version:
traitData <- read.csv(file.choose(), header = T)
datTraits <- traitData[,2:ncol(traitData)]
row.names(datTraits) <- traitData[,1]
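A quick way to see which trait columns are not numeric after this kind of formatting is to test each column with is.numeric(). This is only a sketch on a made-up table: toyData, datToy and the column names are hypothetical stand-ins for traitData and datTraits.

```r
# Hypothetical stand-in for the traits CSV: sample IDs in column 1.
toyData <- data.frame(samp  = c("S1", "S2", "S3"),
                      age   = c(50, 61, 47),
                      grade = c("low", "high", "low"),
                      stringsAsFactors = FALSE)

# Same formatting steps as the simpler version above.
datToy <- toyData[, 2:ncol(toyData)]
row.names(datToy) <- toyData[, 1]

# TRUE for numeric columns, FALSE for text or factor columns.
colIsNumeric <- sapply(datToy, is.numeric)
print(colIsNumeric)

# Names of the columns that would need recoding before a correlation.
nonNumericCols <- names(colIsNumeric)[!colIsNumeric]
print(nonNumericCols)
```

Running the same sapply() call on the real datTraits would list its non-numeric columns, if any.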
I've verified the row names match the sample names of the expression matrix.
> rownames(datTraits) == rownames(wcgna.gbmo.full)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[15] TRUE TRUE TRUE TRUE
I've validated that each individual sample has its own eigenmodule value for the input into the correlation.
> head(MEs)
MEsienna3 MEgreen MEsaddlebrown MEdarkgrey MEblue
GBMO1201.1 0.2178793 -0.09340675 -0.34499492 -0.24748909 -0.09830922
GBMO1201.2 0.1839224 -0.09610101 -0.06033693 0.06365605 -0.11907488
GBMO1201.3 0.1656590 -0.07976885 -0.26531313 0.01856860 -0.11678519
GBMO640.1 0.3776911 0.49972902 0.43240755 0.12641485 -0.08713975
GBMO640.2 0.3508544 0.49579921 0.46101914 -0.08089568 -0.14057770
GBMO640.3 0.4286568 0.49279581 0.18164225 -0.18477486 -0.10349883
MEdarkgreen MEviolet MElightyellow MEdarkmagenta MEbrown
GBMO1201.1 -0.1393957 -0.03123475 -0.02111663 0.3332353 -0.1850020
GBMO1201.2 -0.1976151 -0.09037365 -0.14257361 0.2781233 -0.1493471
GBMO1201.3 -0.1555266 -0.04495820 -0.09723040 0.2871780 -0.1440909
GBMO640.1 -0.3340647 -0.28128321 0.24682502 -0.2770606 -0.1429302
GBMO640.2 -0.4651146 -0.29834322 0.07151911 -0.2462520 -0.1912424
GBMO640.3 -0.3390474 -0.22246063 0.31578824 -0.2310017 -0.2118822
MEdarkolivegreen MEmagenta MEsteelblue MElightgreen MElightcyan
GBMO1201.1 -0.05952359 -0.05344344 0.23991263 0.11744582 0.22873743
GBMO1201.2 -0.07586990 -0.11740880 0.16567254 -0.00214883 0.10111378
GBMO1201.3 -0.08030249 -0.09847137 0.16942868 0.04573327 0.14793170
GBMO640.1 -0.29201956 -0.24093065 -0.09601446 -0.50334649 -0.16389318
GBMO640.2 -0.29934854 -0.22449139 -0.07995932 -0.51727444 -0.16183683
GBMO640.3 -0.27533644 -0.08892418 0.04384012 -0.28597659 0.03829304
MEcyan MEmidnightblue MEgrey
GBMO1201.1 0.5290633 0.29248900 -0.247985418
GBMO1201.2 0.4199512 0.18291670 -0.093351456
GBMO1201.3 0.4391445 0.20701941 -0.087805494
GBMO640.1 -0.2570801 -0.03445327 0.271186218
GBMO640.2 -0.2649197 -0.02110543 -0.006217014
GBMO640.3 -0.1231660 0.18228922 -0.178912316
My understanding from other related posts is that categorical variables work fine with this command, so I'm unsure why datTraits being of type list, rather than a numeric matrix, would cause a problem. Beyond that, I'm at a loss as to why I'm getting this warning. Here's the datTraits csv file that I'm using.
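For reference, one possible way to make a categorical trait usable in a correlation is to recode it as numbers first. The sketch below assumes a two-level trait and uses made-up data (toyTraits, toyME and the column names are hypothetical); it calls base R's stats::cor rather than the WGCNA version, and is not necessarily how the tutorial handles it.

```r
# Hypothetical toy traits; "grade" is a two-level categorical column.
toyTraits <- data.frame(age   = c(50, 61, 47, 58),
                        grade = c("low", "high", "low", "high"),
                        stringsAsFactors = FALSE)

# Recode the categorical column as a 0/1 indicator so every column is numeric.
numTraits <- toyTraits
numTraits$grade <- as.numeric(toyTraits$grade == "high")

# Hypothetical eigengene values for the same four samples.
toyME <- data.frame(MEblue = c(0.2, -0.1, 0.3, -0.4))

# use = "p" is matched to "pairwise.complete.obs".
toyCor <- stats::cor(toyME, numTraits, use = "p")
print(toyCor)
```

A trait with more than two levels would need a different encoding, for example one indicator column per level.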
Any help would be greatly appreciated!
Thanks so much, Agustin.
The ClinicalTraits data file from the WGCNA tutorial was formatted like my original, so I would never have figured this one out on my own.