Question

Differential Expression Analysis between conditions of single cell RNA-Seq

2

6.4 years ago

hkarakurt ▴ 200

Hello everyone, I am working on scRNA-Seq data analysis and I have a technical question. We can combine different scRNA-Seq experiments with batch correction methods such as MNN or CCA. As far as I know, differential expression analysis should account for the batch effect; for example, the Scater/Scran packages provide a block parameter for including batch in the analysis.

But the point is this: if our data sets come from different conditions (let's say healthy and disease), the real source of the batch effect is the condition itself, and we want to compare the transcriptomes of specific cell types between conditions, what should we do? We cannot block on or correct for batch, since the difference between the batches is exactly the effect we want to measure.

Would it be okay to treat the scRNA-Seq data as bulk RNA-Seq data and use raw counts (after removing non-expressed genes, of course) with methods such as DESeq2 or edgeR?
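To make the "treat it as bulk" idea concrete, here is a minimal sketch of one way it could be set up: sum the raw counts of all cells of one cell type within each sample, then drop genes that are zero everywhere. The data layout, the function name `pseudobulk`, and the toy cell and gene labels are all assumptions made for illustration; nothing here comes from a specific package.

```python
from collections import defaultdict

def pseudobulk(counts, cell_sample, cell_type, target_type):
    """Sum raw counts over all cells of one cell type within each sample.

    counts:      dict cell_id -> dict gene -> raw count (hypothetical layout)
    cell_sample: dict cell_id -> sample label
    cell_type:   dict cell_id -> cell type label
    Returns dict sample -> dict gene -> summed raw count.
    """
    summed = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts.items():
        if cell_type[cell] != target_type:
            continue  # only aggregate the cell type being compared
        for gene, n in gene_counts.items():
            summed[cell_sample[cell]][gene] += n

    # Drop genes that have zero counts in every sample for this cell type.
    expressed = sorted({g for s in summed.values() for g, n in s.items() if n > 0})
    return {sample: {g: genes.get(g, 0) for g in expressed}
            for sample, genes in summed.items()}

# Toy data (made up): five cells, two samples, two cell types.
counts = {
    "c1": {"GeneA": 3, "GeneB": 0},
    "c2": {"GeneA": 1, "GeneB": 0},
    "c3": {"GeneA": 0, "GeneB": 0},
    "c4": {"GeneA": 5, "GeneB": 0},
    "c5": {"GeneA": 9, "GeneB": 7},
}
cell_sample = {"c1": "healthy_1", "c2": "healthy_1",
               "c3": "disease_1", "c4": "disease_1", "c5": "disease_1"}
cell_type = {"c1": "T cell", "c2": "T cell", "c3": "T cell",
             "c4": "T cell", "c5": "B cell"}

pb = pseudobulk(counts, cell_sample, cell_type, "T cell")
print(pb)
```

The result is a sample-by-gene table of raw counts for one cell type, which is the kind of input bulk tools such as DESeq2 or edgeR expect. Each sample becomes a single column, so a comparison between conditions still needs more than one sample per condition.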

Thank you in advance.

scRNA-Seq RNA-Seq differential expression analysis • 4.2k views

0

Do you have a real batch effect or are you just concerned about how to match clusters of cells between samples?

ADD REPLY • link 6.4 years ago by Devon Ryan 105k

0

I have a real batch effect. I have 2 data sets for 2 conditions. The experimental methods, platforms and tissues are the same; only the conditions differ. I identified the cell types of the clusters in each data set separately, and now I want to compare specific cell types between conditions.

ADD REPLY • link 6.4 years ago by hkarakurt ▴ 200

1

I don't think it is possible to differentiate between batch effects and biological effects in this case. If you performed the experiment with the biological conditions side by side in two batches, then you can try to look for batch effects.
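A rough way to check which situation you are in is to ask whether an additive "batch + condition" design is full rank for your sample table. The sketch below does that with exact arithmetic; the function names and the sample labels are hypothetical, and it only tests whether the two effects can be separated in principle, not whether the data are good enough to estimate them.

```python
from fractions import Fraction

def design_matrix(samples):
    """Intercept plus dummy columns for batch and condition.

    samples: list of (batch, condition) pairs, one per sample.
    The first level of each factor (sorted order) is the reference.
    """
    batches = sorted({b for b, _ in samples})
    conditions = sorted({c for _, c in samples})
    rows = []
    for batch, condition in samples:
        row = [1]
        row += [int(batch == b) for b in batches[1:]]
        row += [int(condition == c) for c in conditions[1:]]
        rows.append(row)
    return rows

def matrix_rank(rows):
    """Rank via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def condition_is_separable(samples):
    """True if batch and condition columns are not linearly dependent."""
    rows = design_matrix(samples)
    return matrix_rank(rows) == len(rows[0])

# One batch per condition: batch and condition are perfectly confounded.
confounded = [("batch1", "healthy"), ("batch2", "disease")]
# Both conditions processed side by side in each batch.
crossed = [("batch1", "healthy"), ("batch1", "disease"),
           ("batch2", "healthy"), ("batch2", "disease")]

print(condition_is_separable(confounded))
print(condition_is_separable(crossed))
```

With one batch per condition the batch column and the condition column carry the same information, so the check fails; with both conditions present in each batch it passes.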

ADD REPLY • link 6.4 years ago by Damian Kao 16k

score 2 · Accepted Answer · 2019-03-06

2

6.4 years ago

Friederike 9.0k

Butler et al. actually demonstrated their original alignment method on scRNA-seq samples from different conditions. The assumption is that the majority of transcripts will not be affected by the treatment (the same assumption as for bulk RNA-seq) and that the subset of cells that do change will still be detectable after the alignment. So, yes, I would still do the alignment and then proceed with the DE analysis; how else would you identify the populations of cells that you actually want to compare? (I am assuming that you're only interested in the effect of the treatment on a subset of cells, because why else would you do scRNA-seq?)

0

Thank you for your answer. I will read the article. In fact, I have already identified my clusters. I did it separately for my 2 data sets (different conditions), and you are right, I am interested in the effect of the condition change on a subset. I did not use the CCA method of Seurat; I built my pipeline on the Scater package and used the MNN method. I don't know about CCA, but MNN does not change the counts matrix. It creates a reduced-dimension matrix to use for clustering. It actually worked very well for me in these steps. The point is that the normalized counts still have the batch effect, but the batch is the condition I want to test. This makes me a little bit confused.

ADD REPLY • link 6.4 years ago by hkarakurt ▴ 200

0

CCA and MNN are similar in spirit (somewhat); I only referenced the Butler et al. paper because they specifically demonstrated that the approach works (i.e. aligning the cells from different conditions while preserving the condition-induced changes).

Just as Damian pointed out, unless you have multiple replicates per treatment condition, there is no way to differentiate between the technical batch effect and the condition.

ADD REPLY • link 6.4 years ago by Friederike 9.0k

0

Actually, my data is just like what Damian described. I have 4 data sets for each condition, and they come from 2 different batches.

ADD REPLY • link 6.4 years ago by hkarakurt ▴ 200