Question: should we integrate samples from different study groups?

3.8 years ago

mengjieqiu1996 • 0

We want to find any differences between groups with different treatments:

Control: 2 samples
Treatment 1: 2 samples
Treatment 2: 2 samples

Each sample was collected from a different mouse, and all samples were sequenced on a HiSeq. The first step of our analysis is clustering to find cell identities in the samples. My question is: before clustering, should I pool all 6 samples from the 3 conditions and remove batch effects, e.g. with harmony? Alternatively, if we pool only the samples within each study group and run the analysis three times (once each for control, treatment 1, and treatment 2), will the clustering results differ significantly?

RNA-Seq scRNA-seq • 779 views

ADD COMMENT • link updated 3.8 years ago by rpolicastro 13k • written 3.8 years ago by mengjieqiu1996 • 0

score 0 · Answer 1 · 2021-02-19

3.8 years ago

rpolicastro 13k

Batch effects in scRNA-seq tend to be fairly pronounced, so most modern scRNA-seq software includes an "integration" step to reduce them. In general, the samples are processed separately up to the point of integration, which comes before dimension reduction and clustering. For Seurat, see the integration vignette; for Bioconductor, see the MNN vignette.

Thanks! Yes, my current workflow uses harmony. I pooled the samples and integrated them by sample ID, ignoring the treatment of each sample, which gave me a single unified clustering landscape across treatments. Our biologists asked about this because they hypothesize that the landscape should differ between conditions. Maybe we should use significant differences in cell identity proportions to demonstrate "a different landscape"?
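
To make "cell identity proportions" concrete, here is a minimal Python sketch that tabulates, for each sample, the fraction of its cells assigned to each cluster. The function name `cluster_proportions`, the sample IDs, and the cluster labels are all hypothetical placeholders; in practice the two input lists would come from the per-cell metadata of the integrated object.

```python
from collections import Counter, defaultdict


def cluster_proportions(sample_ids, cluster_labels):
    """Return {sample: {cluster: fraction of that sample's cells}}."""
    counts = defaultdict(Counter)
    for sample, cluster in zip(sample_ids, cluster_labels):
        counts[sample][cluster] += 1

    proportions = {}
    for sample, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        proportions[sample] = {
            cluster: n / total for cluster, n in cluster_counts.items()
        }
    return proportions


# Toy per-cell metadata (made up for illustration): one entry per cell.
sample_ids = ["ctrl_1"] * 4 + ["trt1_1"] * 4
cluster_labels = ["A", "A", "A", "B", "A", "B", "B", "B"]

props = cluster_proportions(sample_ids, cluster_labels)
for sample, fractions in sorted(props.items()):
    print(sample, fractions)
```

Keeping the proportions per sample (per mouse) rather than per condition preserves the replicate structure, which any downstream test would need.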

ADD REPLY • link 3.8 years ago by mengjieqiu1996 • 0

You can definitely perform differential abundance testing. Bioconductor describes a method based on the negative binomial distribution. You can also answer the question with a simple Monte Carlo simulation.
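
One possible reading of the Monte Carlo approach is a permutation test: shuffle the condition labels across cells many times and see how often the shuffled difference in a cluster's proportion is at least as large as the observed one. The sketch below is only an illustration under that assumption, with made-up cell counts and hypothetical names (`proportion`, `permutation_pvalue`). Note that shuffling individual cells treats them as independent and ignores mouse-to-mouse variability, so with only 2 samples per group the resulting p-value is likely to be too optimistic.

```python
import random


def proportion(conditions, clusters, condition, cluster):
    """Fraction of cells from `condition` that fall in `cluster`."""
    in_condition = [cl for co, cl in zip(conditions, clusters) if co == condition]
    return sum(cl == cluster for cl in in_condition) / len(in_condition)


def permutation_pvalue(conditions, clusters, group_a, group_b, cluster,
                       n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in cluster proportion."""
    rng = random.Random(seed)
    # Keep only the cells belonging to the two groups being compared.
    pairs = [(co, cl) for co, cl in zip(conditions, clusters)
             if co in (group_a, group_b)]
    conds = [co for co, _ in pairs]
    clus = [cl for _, cl in pairs]

    observed = abs(proportion(conds, clus, group_a, cluster)
                   - proportion(conds, clus, group_b, cluster))

    hits = 0
    shuffled = conds[:]
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign condition labels at random
        diff = abs(proportion(shuffled, clus, group_a, cluster)
                   - proportion(shuffled, clus, group_b, cluster))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up): 100 cells per condition, cluster "A" is 30% vs 60%.
conditions = ["control"] * 100 + ["treatment1"] * 100
clusters = ["A"] * 30 + ["B"] * 70 + ["A"] * 60 + ["B"] * 40

observed, p_value = permutation_pvalue(
    conditions, clusters, "control", "treatment1", "A"
)
print(f"observed difference = {observed:.2f}, permutation p = {p_value:.4f}")
```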

ADD REPLY • link 3.8 years ago by rpolicastro 13k