How to normalize and batch-correct scRNA-seq data from multiple libraries when batch is confounded with biological differences
2.7 years ago
rebeliscu ▴ 60

I was provided with several raw 10x SC RNA-seq datasets that were produced at different time points, representing different genotypes and biological sources:

  • GENOTYPE 1 -- LITTER 1 -- BATCH 1
  • GENOTYPE 2 -- LITTER 2 -- BATCH 2
  • GENOTYPE 2 -- LITTER 3 -- BATCH 3
  • GENOTYPE 3 -- LITTER 4 -- BATCH 3
  • GENOTYPE 3 -- LITTER 5 -- BATCH 3
  • GENOTYPE 1 -- LITTER 6 -- BATCH 4
  • GENOTYPE 3 -- LITTER 7 -- BATCH 5
  • GENOTYPE 1 -- LITTER 8 -- BATCH 6

This paper (https://molecularneurodegeneration.biomedcentral.com/articles/10.1186/s13024-022-00517-z) suggests:

In cases of combining multiple batches (or conditions) of single cell data where there is batch (or condition) specific cell clustering, an elegant integration method such as Seurat/CCA [92] and harmony [98] should be used after normalization of individual datasets to minimize the batch difference.

This seems to suggest that each batch should be normalized separately and that the batches should then be combined for batch correction. However, my understanding was that normalizing for library-size differences, which are invariably influenced by batch, would require the full dataset. Is this the wrong way to think about normalization?
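To make the normalization question concrete, here is a minimal sketch using only the Python standard library. It assumes the simplest kind of library-size normalization: each cell is scaled to a fixed total (the `target_sum` of 10,000 is an illustrative choice, not something from the paper) and then log-transformed. The function names and the toy counts are hypothetical. Under that assumption the calculation uses only the cell's own counts, so normalizing each batch separately and normalizing the combined data give the same values.

```python
import math


def normalize_cell(counts, target_sum=10_000):
    """Scale one cell's counts to a fixed total, then apply log(1 + x).

    Assumes the cell has at least one count (total > 0).
    """
    total = sum(counts)
    return [math.log1p(c * target_sum / total) for c in counts]


def normalize(cells, target_sum=10_000):
    """Normalize every cell independently of all other cells."""
    return [normalize_cell(cell, target_sum) for cell in cells]


# Toy count matrices (cells x genes); the numbers are made up.
batch_a = [[10, 0, 5], [2, 8, 0]]
batch_b = [[100, 50, 50], [0, 30, 70]]

# Normalize per batch and then concatenate ...
separately = normalize(batch_a) + normalize(batch_b)
# ... versus concatenate first and normalize everything together.
jointly = normalize(batch_a + batch_b)

same = separately == jointly
print(same)  # True
```

This equivalence is specific to fixed-target scaling. Schemes that derive their scaling from other cells, for example a target set to the median library size of the cells supplied, or size factors estimated by pooling cells, do depend on which cells are normalized together.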

I'm also wondering if batch correction is advised at all in this scenario, given that batch is confounded with litter and genotype.
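The extent of the confounding can be read directly off the sample list above. The following is a rough sketch that tabulates it, using short labels (`G1`, `L1`, `B1`, and so on) as shorthand for the genotypes, litters, and batches in the list; the variable names are hypothetical.

```python
from collections import defaultdict

# (genotype, litter, batch) for each dataset, transcribed from the list above.
samples = [
    ("G1", "L1", "B1"),
    ("G2", "L2", "B2"),
    ("G2", "L3", "B3"),
    ("G3", "L4", "B3"),
    ("G3", "L5", "B3"),
    ("G1", "L6", "B4"),
    ("G3", "L7", "B5"),
    ("G1", "L8", "B6"),
]

genotypes_per_batch = defaultdict(set)
batches_per_genotype = defaultdict(set)
for genotype, _litter, batch in samples:
    genotypes_per_batch[batch].add(genotype)
    batches_per_genotype[genotype].add(batch)

# Batches containing more than one genotype: the only places where a
# genotype difference can be compared within a single batch.
mixed_batches = {
    batch: sorted(genos)
    for batch, genos in genotypes_per_batch.items()
    if len(genos) > 1
}

# Genotypes that never share a batch with any other genotype.
isolated_genotypes = sorted(
    genotype
    for genotype, batches in batches_per_genotype.items()
    if all(len(genotypes_per_batch[b]) == 1 for b in batches)
)

print(mixed_batches)       # {'B3': ['G2', 'G3']}
print(isolated_genotypes)  # ['G1']
```

Only batch 3 contains more than one genotype (2 and 3), and genotype 1 never shares a batch with another genotype, so from the design alone a genotype 1 effect cannot be distinguished from a batch effect.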

Any insight into how I should treat this dataset in terms of preprocessing would be hugely appreciated.

Thanks in advance!

batch RNA-seq normalization preprocessing single-cell • 1.1k views

This is a complicated question that several groups have studied in detail. Indeed, review papers comparing different methods have already been published (one example, picked more or less at random: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9).

Your best bet is to head to the literature and read, so that you can pick out the solutions most likely to be effective for you. You could also browse https://www.scrna-tools.org/ and find the tools whose manuscripts you would most like to read.


Thanks for the reference! I see I have some reading to do.


Yeah, sorry the answer isn't more helpful.

It's sort of like being asked, "Which statistical test will do the best for my data?" Without being able to see the data, run descriptive statistics on them, and so on, I can't answer the question accurately.
