WGCNA and scRNA-seq data
4.5 years ago
pennakiza ▴ 60

Hello,

I have a dataset of single-cell expression data (at the moment working on CD4 cells only) from 4 patients. Would 4 patients be enough to get any significant results, considering that my sample number is essentially 1200 cells?

Thank you in advance!

sc-RNA seq RNA-Seq wgcna • 4.5k views
ADD COMMENT

Hi Kevin,

I have filtered my dataset for low counts and ended up with ~850 genes. WGCNA runs quite smoothly, but the module-trait correlations that I see are quite weak. I was wondering whether that is because I am working with so few genes, or because all of those cells come from only 4 patients.

Penny

ADD REPLY

Could be a few reasons. So, you have 850 genes x ~1200 cells? I'm still not sure that WGCNA is best for scRNA-seq data, and I believe running WGCNA on PC eigenvectors would be better (as I explain in my answer, below). The cellular heterogeneity that comes with scRNA-seq datasets may be what is 'beating' WGCNA in this case, along with the fact that you are effectively dealing with 4 batches (4 samples). Or have you run it on the 'integrated' dataset after adjustment for batch?

You are literally the first person that I have ever heard of using WGCNA on scRNA-seq data.
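
One rough way to check the "4 batches" concern is to ask how much of a module eigengene (or a PC) is explained by patient identity alone. Below is a minimal sketch in plain Python using made-up toy values. The names `eta_squared`, `eigengene`, and `patient` are purely illustrative and not part of WGCNA.

```python
from collections import defaultdict


def eta_squared(values, groups):
    """Fraction of the total variance in `values` explained by group membership
    (between-group sum of squares divided by total sum of squares)."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # constant input: nothing to explain
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


# Toy data (invented): one eigengene value per cell, 3 cells per patient.
eigengene = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9, 5.0, 5.2, 4.8, 7.0, 7.1, 6.9]
patient = ["P1"] * 3 + ["P2"] * 3 + ["P3"] * 3 + ["P4"] * 3

print(round(eta_squared(eigengene, patient), 3))
```

A value close to 1 would suggest that the eigengene mostly tracks patient of origin, in which case weak module-trait correlations may say more about batch than about biology.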

ADD REPLY

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under Kevin's answer.

SUBMIT ANSWER is for new answers to original question.

ADD REPLY

It seems like enough to me. To some extent, each cell can be treated as a sample.

ADD REPLY
4.5 years ago

To run WGCNA on such a dataset, you will require a lot of RAM, assuming that you want to run it over the entire transcriptome of each cell. Moreover, I question what exactly the result would mean when compared to the output of other methods such as t-SNE, UMAP, pseudo-time analysis, etc.

None of us can stop you from going ahead with this, but I just question what exactly it would mean. The aforementioned data reduction methods were designed specifically to reduce the computational burden of processing and interpreting scRNA-seq data. Thus, it may make more sense to run WGCNA on a number of principal components that together account for an appreciable proportion of explained variation, e.g. > 80%.
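
As a rough illustration of picking that number of PCs, here is a minimal sketch in plain Python. It assumes you already have the per-PC variances from whatever PCA you ran. The function name `n_components_for_variance` and the toy variances are made up for the example, and 0.80 is just the cutoff suggested above, not a rule.

```python
def n_components_for_variance(variances, threshold=0.80):
    """Return the smallest number of leading PCs whose cumulative share
    of the total variance reaches `threshold`."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    total = sum(variances)
    if total <= 0:
        raise ValueError("variances must sum to a positive number")
    ordered = sorted(variances, reverse=True)  # PCs ranked by variance
    cumulative = 0.0
    for k, var in enumerate(ordered, start=1):
        cumulative += var
        if cumulative / total >= threshold:
            return k
    return len(ordered)  # guard against floating-point shortfall


# Toy per-PC variances (invented, not from any real dataset).
pc_variances = [40.0, 25.0, 10.0, 8.0, 7.0, 5.0, 3.0, 2.0]
k = n_components_for_variance(pc_variances)
print(k)
```

The cells x k matrix of PC scores would then be the input, in place of the full cells x genes matrix.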

Kevin

ADD COMMENT

Actually, the computational expense is not that high, especially if genes with low variability and/or a low expression level are filtered out before the adjacency matrix is built.

The additional information gained would be the "wiring" of the gene expression network in different clusters, and the identification of potential key driver genes.
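
To make that concrete, here is a minimal sketch in plain Python of the general idea: drop low-variance genes, build an unsigned power adjacency |cor|^beta, and rank genes by connectivity as a crude proxy for hub or driver candidates. This is not the WGCNA package itself. All function names, the toy expression values, the `min_var` cutoff, and `beta=6` are assumptions chosen for illustration.

```python
from math import sqrt


def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_genes(expr, min_var):
    """Keep only genes whose variance across cells is at least min_var."""
    return {g: v for g, v in expr.items() if variance(v) >= min_var}


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            adj[gi][gj] = a
            adj[gj][gi] = a
    return adj


def connectivity(adj):
    """Sum of adjacencies to all other genes, per gene."""
    return {g: sum(nb.values()) for g, nb in adj.items()}


# Toy expression values (invented): gene -> expression across 5 cells.
expr = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [2, 4, 6, 8, 10.5],
    "G3": [5, 3, 4, 1, 2],
    "G4": [1, 1, 1, 1, 1.01],  # nearly constant, should be filtered out
}
kept = filter_genes(expr, min_var=0.1)
adj = adjacency(kept, beta=6)
conn = connectivity(adj)
for gene in sorted(conn, key=conn.get, reverse=True):
    print(gene, round(conn[gene], 3))
```

Since the adjacency is pairwise, its size grows with the square of the number of genes kept, which is why filtering first keeps the cost manageable.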

ADD REPLY

Of course.

ADD REPLY
