ssGSEA scores correlate with the number of gene counts, should I be worried?

3.1 years ago

ishakbishara91 • 0

I computed Pearson correlations between the ssGSEA scores for all 50 Hallmark pathways and the number of gene counts in my data. Most Hallmark ssGSEA scores correlate with the number of gene counts.

Is there a biological reason for this correlation, or is it purely a technical artifact? I also noticed that a more stringent nCount filtering cut-off weakens the correlation between nCount and the pathway scores.

Note: A +/- 0.3 Pearson correlation coefficient cut-off splits red and blue points.
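For readers who want to reproduce this kind of check, here is a minimal sketch of the correlation step in Python with NumPy. The function name, the toy numbers, and the assumption that pathways with |r| at or above the cut-off are the ones highlighted are all illustrative; this is not the code used for the original analysis.

```python
import numpy as np

def correlate_scores_with_counts(scores, counts, cutoff=0.3):
    """Pearson r between each pathway's scores and the per-cell counts.

    scores: array of shape (n_cells, n_pathways), e.g. ssGSEA scores.
    counts: array of shape (n_cells,), e.g. per-cell gene counts.
    Returns (r, flagged), where flagged marks pathways with |r| >= cutoff.
    A pathway with constant scores yields NaN and is never flagged.
    """
    scores = np.asarray(scores, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Center both variables so the dot product gives the covariance term.
    sc = scores - scores.mean(axis=0)
    cc = counts - counts.mean()
    denom = np.sqrt((sc ** 2).sum(axis=0) * (cc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (sc * cc[:, None]).sum(axis=0) / denom
    flagged = np.abs(r) >= cutoff
    return r, flagged

# Hypothetical toy data: 5 cells, 3 pathways.
toy_counts = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
toy_scores = np.column_stack([
    2.0 * toy_counts + 1.0,                 # rises with counts
    np.array([1.0, -1.0, 0.0, -1.0, 1.0]),  # unrelated to counts
    -toy_counts,                            # falls with counts
])
toy_r, toy_flagged = correlate_scores_with_counts(toy_scores, toy_counts)
```

Repeating the same call after applying different count filters would show how the cut-off changes the correlations.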

counts correlation confounding GSEA • 2.1k views

ADD COMMENT • link updated 3.1 years ago by igor 13k • written 3.1 years ago by ishakbishara91 • 0

How low are your gene counts? Are you using normalized data for ssGSEA?

ADD REPLY • link 3.1 years ago by igor 13k

The average gene count per cell is about 2,200. I set the lower cut-off for the gene count at 500. I'm using ZinbWave-normalized data to calculate the ssGSEA scores.

ADD REPLY • link 3.1 years ago by ishakbishara91 • 0

Are you using this for single-cell data? It's designed for bulk.

ADD REPLY • link 3.1 years ago by igor 13k

What are you referring to by "this"?

ADD REPLY • link 3.1 years ago by ishakbishara91 • 0

ssGSEA analysis

ADD REPLY • link 3.1 years ago by igor 13k

Yes, you're correct. That said, I found some studies that ran GSEA/ssGSEA at the single-cell level. My reasoning was that ZinbWave would correct for the sparsity before ssGSEA, but I don't know of any studies that have benchmarked this approach.

I haven't tried tools specifically designed for gene set enrichment at the single-cell level, such as PAGODA2 and VISION, since the literature is still scarce.

Another approach is to run ssGSEA on pseudo-bulk profiles by summing counts either by cluster or by sample/cell type. But this is a last resort, since it reduces the resolution.
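To make the pseudo-bulk idea concrete, here is a rough sketch in Python with NumPy that sums raw counts per group (cluster, sample, or cell type). The function name and the toy matrix are hypothetical, and it assumes a dense cells-by-genes count matrix with one group label per cell.

```python
import numpy as np

def pseudobulk_sum(counts, groups):
    """Sum raw counts over all cells that share a group label.

    counts: array of shape (n_cells, n_genes) with raw counts.
    groups: sequence of length n_cells with a cluster or sample label per cell.
    Returns (labels, summed), where summed has shape (n_groups, n_genes)
    and row i holds the summed counts for labels[i].
    """
    counts = np.asarray(counts)
    groups = np.asarray(groups)
    # inverse[i] is the row of `summed` that cell i contributes to.
    labels, inverse = np.unique(groups, return_inverse=True)
    summed = np.zeros((labels.size, counts.shape[1]), dtype=counts.dtype)
    # Unbuffered add, so repeated group indices accumulate correctly.
    np.add.at(summed, inverse, counts)
    return labels, summed

# Hypothetical toy data: 4 cells, 3 genes, 2 clusters.
toy_matrix = np.array([
    [1, 0, 2],
    [0, 3, 1],
    [4, 0, 0],
    [1, 1, 1],
])
toy_groups = ["A", "B", "A", "B"]
pb_labels, pb_counts = pseudobulk_sum(toy_matrix, toy_groups)
```

Each row of the result can then be normalized and scored like a bulk sample, at the cost of losing per-cell resolution.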

What do you recommend?

ADD REPLY • link 3.1 years ago by ishakbishara91 • 0

If you are worried about scarcity of literature, AddModuleScore from Seurat is probably used in most papers using Seurat.

Whether you want to do this at the single-cell or pseudo-bulk level depends on the exact question you are trying to answer.

ADD REPLY • link 3.1 years ago by igor 13k
