Question

TOM matrix generated by WGCNA package in R

0

2.8 years ago

Dude • 0

Hi everyone! I have a problem when running the WGCNA code in R. The TOM matrix returned by the function "TOMsimilarityFromExpr" is filled with NA values everywhere except the diagonal. Why does this happen? I would appreciate it if anyone could help me with this. Thank you!! 🙏 The code and results are as follows:

>  dissTOM = 1-TOMsimilarityFromExpr(datExpr, power = 8); 
> dissTOM[1:6,1:6]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0   NA   NA   NA   NA   NA
[2,]   NA    0   NA   NA   NA   NA
[3,]   NA   NA    0   NA   NA   NA
[4,]   NA   NA   NA    0   NA   NA
[5,]   NA   NA   NA   NA    0   NA
[6,]   NA   NA   NA   NA   NA    0

> TOM = TOMsimilarityFromExpr(datExpr, power = 8)
> TOM[1:6,1:6]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1   NA   NA   NA   NA   NA
[2,]   NA    1   NA   NA   NA   NA
[3,]   NA   NA    1   NA   NA   NA
[4,]   NA   NA   NA    1   NA   NA
[5,]   NA   NA   NA   NA    1   NA
[6,]   NA   NA   NA   NA   NA    1

WGCNA • 2.6k views

ADD COMMENT • link updated 2.7 years ago by andres.firrincieli 3.8k • written 2.8 years ago by Dude • 0

0

What is the output of dim(datExpr)?

ADD REPLY • link 2.8 years ago by andres.firrincieli 3.8k

0

It's a gene expression set. 

> class(datExpr)
[1] "matrix" "array" 
> dim(datExpr)
[1]    95 19206

> datExpr[1:4,1:4]
             ENSG00000000003 ENSG00000000005 ENSG00000000419 ENSG00000000457
TCGA-2Y-A9H4        39.95787     0.000000000        27.44848        1.323982
TCGA-5C-AAPD        26.10105     0.000000000        24.81812        1.587723
TCGA-BC-A10W        13.57470     0.009521119        27.08928        7.692339
TCGA-BW-A5NP        28.06425     0.022192741        18.36840        4.594120

ADD REPLY • link 2.8 years ago by Dude • 0

0

Does this also happen with the function adjacency?

Without datExpr it is difficult to understand what is going on. Would you mind sharing the matrix? You can change the sample names.

ADD REPLY • link 2.8 years ago by andres.firrincieli 3.8k

0

Hello! I just tried the function "adjacency". It seems to take much longer than the run without adjacency did (hours so far), and I am still not sure whether it will work.

>   adja <- adjacency(datExpr,power = 8)
>   dissTOM = 1-TOMsimilarityFromExpr(adja, power = 8); 
TOM calculation: adjacency..
..will not use multithreading.

The datExpr file is available at the link below. Thank you for your time and attention.

https://github.com/Datapioneer/QUESTION/tree/main

ADD REPLY • link 2.7 years ago by Dude • 0

1

Thanks for the file. TOMsimilarityFromExpr takes the expression data as input, not an adjacency matrix. To compute the TOM from the adjacency matrix you already calculated, use TOMsimilarity:

TOM = TOMsimilarity(adja)

ADD REPLY • link 2.7 years ago by andres.firrincieli 3.8k

score 2 · Accepted Answer · 2022-02-24

Apparently the NA values in the TOM are introduced because you have 373 genes with too many zeros:

library(readr)   # provides read_csv
library(WGCNA)   # provides goodSamplesGenes

datExpr0 <- read_csv("D:/Download/datExpr.csv")
datExpr0 <- data.frame(datExpr0, row.names = 1)

gsg = goodSamplesGenes(datExpr0, verbose = 3);
# Flagging genes and samples with too many missing values...
#  ..step 1
#  ..Excluding 373 genes from the calculation due to too many missing samples or zero variance.
#  ..step 2
gsg$allOK
# [1] FALSE
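
To see why a handful of zero-variance genes can blank out the whole matrix and not just their own rows, here is a minimal base-R sketch. It uses a simplified unsigned TOM formula written for illustration; `naive_tom` and the toy data are hypothetical and this is not WGCNA's implementation. The correlation of a constant gene with anything is NA, and because each TOM entry sums over all genes, that NA ends up in every off-diagonal entry.

```r
# Simplified unsigned TOM in base R (illustration only, not WGCNA's code).
naive_tom <- function(expr, power = 8) {
  # cor() warns and returns NA for any pair involving a constant column.
  adj <- abs(suppressWarnings(cor(expr)))^power
  diag(adj) <- 0
  shared <- adj %*% adj            # sum over u of a_iu * a_uj
  k <- rowSums(adj)                # connectivity of each gene
  tom <- (shared + adj) / (outer(k, k, pmin) + 1 - adj)
  diag(tom) <- 1
  tom
}

set.seed(1)
toy <- matrix(rnorm(40), nrow = 10,
              dimnames = list(NULL, paste0("gene", 1:4)))
toy[, 2] <- 0                      # gene2 is all zeros, so its variance is zero

tom_bad <- naive_tom(toy)
print(tom_bad)                     # off-diagonal entries are all NA

# Dropping the zero-variance gene removes the source of the NA values.
keep <- apply(toy, 2, var) > 0
tom_ok <- naive_tom(toy[, keep])
print(anyNA(tom_ok))
```

The toy result has the same shape as the output in the question: 1 on the diagonal and NA everywhere else, even for pairs of genes that are themselves fine.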

Remove offending genes

if (!gsg$allOK)
{
    # Optionally, print the gene and sample names that were removed:
    if (sum(!gsg$goodGenes)>0)
        printFlush(paste("Removing genes:", paste(names(datExpr0)[!gsg$goodGenes], collapse = ", ")));
    if (sum(!gsg$goodSamples)>0)
        printFlush(paste("Removing samples:", paste(rownames(datExpr0)[!gsg$goodSamples], collapse = ", ")));
    # Remove the offending genes and samples from the data:
    datExpr = datExpr0[gsg$goodSamples, gsg$goodGenes]
}

Calculate TOM

TOM = TOMsimilarityFromExpr(datExpr, power = 8)
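
Before moving on to clustering, it may be worth confirming that the filtered data really produced a TOM without missing values. The helper below is a hypothetical base-R sketch (`summarize_na` is not a WGCNA function); it is shown on a small toy matrix, and the same call would apply to the TOM object computed above.

```r
# Count missing values among the off-diagonal entries of a square matrix.
summarize_na <- function(m) {
  off_diag <- m[row(m) != col(m)]
  c(n_off_diag = length(off_diag),
    n_na       = sum(is.na(off_diag)),
    frac_na    = mean(is.na(off_diag)))
}

# Toy matrix mimicking the broken result: 1 on the diagonal, NA elsewhere.
toy_tom <- matrix(NA_real_, nrow = 3, ncol = 3)
diag(toy_tom) <- 1
na_report <- summarize_na(toy_tom)
print(na_report)
```

A `frac_na` of 0 on the real TOM would indicate that the gene filtering removed the problem.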