I am a student performing downstream snRNA-seq analysis (differential expression, among other things) on data gathered from multiple batches.
I am confused as to whether I should perform this on raw counts (normalized and log-transformed) or on batch-corrected counts (corrected with Harmony or Scanorama, for example). As far as I know, this is generally done on raw counts. However, I don't understand why batch effects wouldn't skew the results. Isn't the whole point of batch correction to remove unwanted technical variation so that we can look for biological variability? Performing downstream analysis on raw counts would seem to include that technical variability in the findings.
What am I missing?
Thanks!
Ah, I see! That makes sense, thank you! Excellent link as well.