11 months ago
Sara
For a few samples, we prepared the libraries for single-cell sequencing (10x Genomics) together, but they will be sequenced on the same machine on different days (3 months apart). My question is: would this introduce a batch effect that I need to correct for?
Sequencing itself, even across different Illumina machines, usually introduces no batch effect. That said, look at the data; only then will you know. Check whether there are any clusters, or shifts within clusters, that are driven by the batch. The easiest way is to cluster and then color the UMAP / PCA / other dimensionality reductions by sequencing batch.
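As a rough numerical complement to coloring the embedding by batch, one can ask how often a cell's nearest neighbors come from its own batch. The sketch below is only an illustration under stated assumptions: the function names, the simulated data, and the labels `run1`/`run2` are hypothetical, and in practice the input would be your own PCA coordinates and sequencing-batch labels. A score close to the baseline suggests the batches are well mixed; a score near 1 suggests the cells separate by batch.

```python
import numpy as np

def same_batch_neighbor_fraction(embedding, batches, k=15):
    """Per cell: fraction of its k nearest neighbors that share its batch label."""
    embedding = np.asarray(embedding, dtype=float)
    batches = np.asarray(batches)
    # Pairwise squared Euclidean distances (fine for a few thousand cells).
    sq = np.sum(embedding ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (batches[nearest] == batches[:, None]).mean(axis=1)

def expected_fraction(batches):
    """Approximate same-batch fraction expected if batches were perfectly mixed."""
    _, counts = np.unique(np.asarray(batches), return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p ** 2))

# Simulated stand-in for PCA coordinates (hypothetical data, not real cells).
rng = np.random.default_rng(0)
batches = np.repeat(["run1", "run2"], 200)
mixed = rng.normal(size=(400, 10))       # both batches drawn from one distribution
shifted = mixed.copy()
shifted[batches == "run2", 0] += 8.0     # artificial batch shift along one axis

baseline = expected_fraction(batches)
mixed_score = same_batch_neighbor_fraction(mixed, batches).mean()
shifted_score = same_batch_neighbor_fraction(shifted, batches).mean()
print(f"baseline={baseline:.2f} mixed={mixed_score:.2f} shifted={shifted_score:.2f}")
```

Note that a high score is not necessarily technical: if the batches contain different cell types or conditions, they will also separate for biological reasons.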
True for visualization.
However, suppose we have 2 batches and, right after projecting onto a UMAP, we see a nice uniform distribution of cells from both batches, i.e. they merge nicely with each other. How can you know that they do not suffer from a batch effect?
Given that you're (hopefully) going to model any batch effect in your DE analysis, and that integration methods are mostly useful for cramming like with like for visualization/embedding purposes, you'd model unwanted variation in your DE analysis as you typically would and assume the embeddings are fine.
If populations are overlapping as expected, skip integrating your samples, as it's more or less impossible to determine how well it's actually "working" on your data other than cramming similar cells together anyway.
OK, but this is probably not due to sequencing. The libraries have probably been prepared on different days. I hope the design is balanced, meaning that every group is present in every batch. Integration deals well with the batch effect, but for DE analysis, as Jared says, you should account for it, which is only possible if groups are balanced between batches. Experimental design matters.
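To check whether every group is present in every batch, a simple cross-tabulation of the sample sheet is enough. The sketch below is a minimal illustration: the helper names, group labels, and batch labels are hypothetical. An empty cell is a warning sign rather than a final verdict; if each group sits entirely in its own batch, group and batch cannot be separated in the DE model at all.

```python
from collections import Counter

def design_table(groups, batches):
    """Count samples per (group, batch) combination, including empty cells."""
    counts = Counter(zip(groups, batches))
    group_levels = sorted(set(groups))
    batch_levels = sorted(set(batches))
    return {g: {b: counts.get((g, b), 0) for b in batch_levels}
            for g in group_levels}

def has_empty_cells(table):
    """True if at least one group is missing from at least one batch."""
    return any(n == 0 for row in table.values() for n in row.values())

# Hypothetical sample sheets: one entry per sample.
groups = ["control", "control", "treated", "treated"]
balanced_batches = ["day1", "day2", "day1", "day2"]    # every group in every batch
confounded_batches = ["day1", "day1", "day2", "day2"]  # group and batch coincide

balanced = design_table(groups, balanced_batches)
confounded = design_table(groups, confounded_batches)

for name, table in [("balanced", balanced), ("confounded", confounded)]:
    print(name, table, "empty cells:", has_empty_cells(table))
```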
Many thanks.