Question

Seurat: Assay to use for differential expression in integrated data

4.2 years ago

firestar ★ 1.7k

I have a single-cell RNA-seq Seurat object integrated using sctransform.

When running FindMarkers(), which assay is to be used? RNA, SCT or integrated? I assume the slot is always "data".

To look at some real data, I ran differential gene expression (DGE) between the same two groups using each assay:

f1 <- FindMarkers(obj,ident.1 = "micro-wt-lps-01",ident.2="micro-wt-lps-05",
                  group.by="cell_type_condition_cluster",slot="data",assay="RNA")
f2 <- FindMarkers(obj,ident.1 = "micro-wt-lps-01",ident.2="micro-wt-lps-05",
                  group.by="cell_type_condition_cluster",slot="data",assay="SCT")
f3 <- FindMarkers(obj,ident.1 = "micro-wt-lps-01",ident.2="micro-wt-lps-05",
                  group.by="cell_type_condition_cluster",slot="data",assay="integrated")

The results and DEGs are all different.

> head(f1)
              p_val   avg_logFC pct.1 pct.2    p_val_adj
Mmp12  8.656001e-23 -17.9039715 0.001 0.111 1.417853e-18
Cxcl2  2.153860e-18 -14.1067209 0.002 0.111 3.528023e-14
Xylt1  6.461056e-17  -0.6137400 0.010 0.222 1.058321e-12
Rhebl1 5.547881e-15  -0.3580077 0.018 0.278 9.087429e-11
Lpl    1.582957e-14  57.4116369 0.047 0.444 2.592883e-10
Tgfb2  1.202214e-13  -1.4634266 0.007 0.167 1.969227e-09
> head(f2)
               p_val  avg_logFC pct.1 pct.2    p_val_adj
Hp      7.286762e-30 -0.2764460 0.011 0.333 1.193572e-25
S100a11 9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ifitm2  9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ccr2    9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Ahnak   9.700438e-29 -0.2758875 0.012 0.333 1.588932e-24
Cybb    1.039943e-27 -0.4562356 0.018 0.389 1.703426e-23
> head(f3)
               p_val  avg_logFC pct.1 pct.2    p_val_adj
Gas2l3  3.419395e-11  1.7086466 0.126 0.944 1.025819e-07
Cxcr4   2.159168e-09 -0.3924339 0.022 0.500 6.477503e-06
Kif2c   3.049359e-09 -0.5152160 0.119 0.722 9.148077e-06
Rps18   6.295892e-09 -0.8022487 0.034 0.167 1.888768e-05
mt-Atp6 2.352245e-08 -0.5896836 0.167 0.778 7.056735e-05
Rps8    2.648383e-08 -0.6596474 0.018 0.167 7.945150e-05

I picked two genes which were found in all three results to show how the fold change varies.

> f1[c("Cxcl2","Lpl"),]
             p_val avg_logFC pct.1 pct.2    p_val_adj
Cxcl2 2.153860e-18 -14.10672 0.002 0.111 3.528023e-14
Lpl   1.582957e-14  57.41164 0.047 0.444 2.592883e-10
> f2[c("Cxcl2","Lpl"),]
             p_val  avg_logFC pct.1 pct.2    p_val_adj
Cxcl2 2.060274e-18 -0.3262467 0.002 0.111 3.374729e-14
Lpl   2.822407e-14 -1.3114768 0.038 0.389 4.623102e-10
> f3[c("Cxcl2","Lpl"),]
           p_val  avg_logFC pct.1 pct.2 p_val_adj
Cxcl2 0.00328671 -27.233063 0.050 0.167         1
Lpl   0.44693810  -5.408191 0.117 0.556         1

seurat 10x single-cell RNA-Seq • 7.9k views


score 2 · Answer 1 · 2021-02-17

4.2 years ago

rpolicastro 13k

You shouldn't use SCT and integrated counts for anything outside of dimension reduction and clustering. Before differential expression you can run NormalizeCounts on the RNA assay.
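A minimal sketch of that workflow, assuming the normalization function meant here is Seurat's NormalizeData. The toy count matrix, the gene and cell names, and the `group` metadata column are made up for illustration; with a real object you would skip the simulation and pass your own `group.by` column and identities.

```r
library(Seurat)

set.seed(1)

# Hypothetical toy data: 200 genes x 60 cells of Poisson counts.
counts <- matrix(
  rpois(200 * 60, lambda = 2),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:60))
)
# Raise the first 5 genes in the first 30 cells so there is something to detect.
counts[1:5, 1:30] <- counts[1:5, 1:30] + rpois(5 * 30, lambda = 10)

# CreateSeuratObject stores the counts in an assay named "RNA" by default.
obj <- CreateSeuratObject(counts = counts)
obj$group <- rep(c("A", "B"), each = 30)  # hypothetical grouping column

# Log-normalize the RNA assay, then test on that assay explicitly.
DefaultAssay(obj) <- "RNA"
obj <- NormalizeData(obj, assay = "RNA")

markers <- FindMarkers(
  obj,
  ident.1 = "A",
  ident.2 = "B",
  group.by = "group",
  assay = "RNA"
)
head(markers)
```

Setting `assay = "RNA"` in the call makes the choice explicit, so the result does not depend on whichever assay happens to be the default after integration.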

Relevant Seurat links:

Thanks for the reply and links. By NormalizeCounts, I guess you mean NormalizeData. Also, why normalised data? Shouldn't I be using raw counts with covariates?

Reply by firestar ★ 1.7k, 4.2 years ago