Question

WGCNA and SC-RNA Seq data

0

4.6 years ago

pennakiza ▴ 60

Hello,

I have a dataset of single-cell expression data (at the moment working on CD4 cells only) from 4 patients. Would 4 patients be enough to get any significant results, considering that my sample number is essentially 1200 cells?

Thank you in advance!

sc-RNA seq RNA-Seq wgcna • 4.6k views

ADD COMMENT • link updated 3.6 years ago by liangqinsi ▴ 50 • written 4.6 years ago by pennakiza ▴ 60

0

Hi Kevin,

I have filtered my dataset for low counts, so I have ended up with ~850 genes, and WGCNA runs quite smoothly but the module-trait correlations that I see are quite weak. I was wondering if that is because I am working with so few genes or because all those cells come from only 4 patients.

Penny

ADD REPLY • link 4.6 years ago by pennakiza ▴ 60

1

Could be a few reasons. So, you have 850 genes x ~1200 cells? I'm still not sure that WGCNA is best for scRNA-seq data, and I believe running WGCNA on PC eigenvectors would be better (as I explain in my answer, below). The cellular heterogeneity that comes with scRNA-seq datasets may be what is 'beating' WGCNA in this case, along with the fact that you are effectively dealing with 4 batches (4 samples). Or have you run it on the 'integrated' dataset after adjustment for batch?

You are literally the first person that I have ever heard of using WGCNA on scRNA-seq data.

ADD REPLY • link 4.6 years ago by Kevin Blighe 88k

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under Kevin's answer.

SUBMIT ANSWER is for new answers to original question.

ADD REPLY • link 4.6 years ago by GenoMax 148k

0

That seems like enough to me. To some extent, each cell can be treated as a sample.

ADD REPLY • link 3.6 years ago by liangqinsi ▴ 50

score 0 · Answer 1 · 2020-05-20

0

4.6 years ago

Kevin Blighe 88k

To run WGCNA on such a dataset, you will require a lot of RAM, assuming that you want to run it over the entire transcriptome of each cell. Moreover, I question what exactly it would mean when compared to the output of other methods such as t-SNE, UMAP, pseudotime analysis, etc.

None of us can stop you going ahead with this, but I just question what exactly it would mean. The aforementioned data reduction methods were designed specifically to reduce the computational burden of processing and interpreting scRNA-seq data. Thus, it may make more sense to run WGCNA on the set of leading principal components that together account for an appreciable amount of explained variation, e.g. > 80%.
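As a rough illustration of picking that number of PCs, here is a minimal sketch. It assumes you already have the variance explained by each PC from whatever PCA routine you used (for example, the squared `sdev` values from `prcomp` in R). The helper name `n_pcs_for_variance` and the example numbers are made up for illustration; this is not part of WGCNA.

```python
from itertools import accumulate


def n_pcs_for_variance(pc_variances, threshold=0.80):
    """Return the smallest number of leading PCs whose cumulative share of
    the total variance exceeds `threshold`.

    Hypothetical helper for illustration; `pc_variances` is a list with one
    variance value per principal component.
    """
    if not pc_variances:
        raise ValueError("pc_variances must not be empty")
    # PCs are conventionally ordered by decreasing variance; enforce it here.
    ordered = sorted(pc_variances, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total > threshold:
            return k
    # Threshold never exceeded (e.g. threshold=1.0): keep every PC.
    return len(ordered)


# Made-up variances for 6 PCs (illustrative numbers only).
example_variances = [5.0, 2.5, 1.0, 0.8, 0.5, 0.2]
n_keep = n_pcs_for_variance(example_variances, threshold=0.80)
print(n_keep)
```

The per-cell scores on the first `n_keep` PCs would then be the input to the network analysis in place of the full gene-by-cell matrix. The 80% cut-off is only the figure suggested above, not a fixed rule.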

Kevin

ADD COMMENT • link 4.6 years ago by Kevin Blighe 88k

0

Actually, the computational expense is not that high, especially if the adjacency matrix is filtered to remove genes with low variability and/or a low expression level.
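A minimal sketch of that kind of filter, applied to the genes before any adjacency matrix is built, could look like the following. The function name `filter_genes`, the cut-off values, and the toy data are all assumptions for illustration; sensible thresholds depend on how the counts were normalised.

```python
from statistics import mean, pvariance


def filter_genes(expr, min_mean=1.0, min_var=0.5):
    """Keep genes whose mean expression and variance across cells both reach
    the given cut-offs.

    `expr` maps a gene name to a list of expression values, one per cell.
    Hypothetical helper; the default cut-offs are arbitrary.
    """
    return {
        gene: values
        for gene, values in expr.items()
        if mean(values) >= min_mean and pvariance(values) >= min_var
    }


# Toy data: three genes measured in four cells (made-up values).
toy = {
    "GENE_A": [0, 0, 1, 0],  # low expression: dropped
    "GENE_B": [5, 5, 5, 5],  # expressed but constant: dropped
    "GENE_C": [2, 8, 1, 9],  # expressed and variable: kept
}
kept = filter_genes(toy)
print(sorted(kept))
```

Since the adjacency matrix has one row and one column per gene, its size grows with the square of the number of genes kept, which is why this step reduces memory use so much.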

The additional information gained would be the "wiring" of the gene expression network in different clusters, and the identification of potential key driver genes.