I’m not very well versed in bioinformatics, but I have been looking at gene counts for a particular gene of interest. I have 3 independent reps. The first two are very similar in their gene counts, but the third is off, which is throwing off the significance of the results. I was told to do batch correction, but I don’t think that is the correct thing to do. The samples were all extracted at different times, and a different batch of the extraction kit was used for the third rep, but all reps were sequenced at the same time. Can you only do batch correction if the reps were sequenced at different times? And does using a different batch of reagents cause batch effects that need batch correction?
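Sequencing date isn't the issue here: all of your reps were sequenced together, so don't worry about that. Having different prep dates absolutely can add technical artifacts. And you can't just magically fix that with exactly three samples from three different prep dates, because each prep date has only one sample, so the batch effect can't be separated from ordinary replicate-to-replicate variation.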
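To make the "one sample per prep date" problem concrete, here is a rough sketch, assuming exactly three samples with one per prep batch. The helper `residual_df` and the batch labels are made up for illustration. It builds a design matrix with an intercept plus batch terms and counts how many degrees of freedom are left over to estimate noise.

```python
import numpy as np

def residual_df(batch_labels):
    """Degrees of freedom left after fitting an intercept plus batch terms."""
    labels = list(batch_labels)
    levels = sorted(set(labels))
    n = len(labels)
    # Intercept column, then one indicator column per non-reference batch.
    design = np.ones((n, 1))
    for level in levels[1:]:
        col = np.array([1.0 if b == level else 0.0 for b in labels]).reshape(n, 1)
        design = np.hstack([design, col])
    rank = np.linalg.matrix_rank(design)
    return n - rank

# Hypothetical labels: one sample per prep date vs. three samples per prep date.
three_reps = residual_df(["prep1", "prep2", "prep3"])
six_reps = residual_df(["prep1", "prep1", "prep1", "prep2", "prep2", "prep2"])
print(three_reps, six_reps)
```

With one sample per batch, the batch terms use up every degree of freedom, so nothing is left to tell batch apart from replicate variation. With several samples per batch (the second, hypothetical layout), some degrees of freedom remain.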
With only three samples total, you have to accept the samples as they are, unless something in the QC makes it obvious that the results of the third sample are especially unreliable, e.g., the third sample has a quarter as many reads, has very bad quality scores, or has far fewer mapped reads.
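A quick way to run that kind of sanity check might look like the sketch below. The function `flag_qc_outliers`, the metric names, the 50% cutoff, and all the numbers are invented for illustration; in practice, use whatever your QC reports actually give you.

```python
from statistics import median

def flag_qc_outliers(metrics, min_fraction=0.5):
    """Flag samples whose metric falls below min_fraction of the across-sample median."""
    flagged = {}
    metric_names = list(next(iter(metrics.values())).keys())
    for name in metric_names:
        med = median(sample[name] for sample in metrics.values())
        for sample_id, sample in metrics.items():
            if sample[name] < min_fraction * med:
                flagged.setdefault(sample_id, []).append(name)
    return flagged

# Made-up QC numbers: rep3 has roughly a quarter as many reads as the others.
qc = {
    "rep1": {"total_reads": 40_000_000, "mapped_fraction": 0.92},
    "rep2": {"total_reads": 38_000_000, "mapped_fraction": 0.90},
    "rep3": {"total_reads": 10_000_000, "mapped_fraction": 0.88},
}
result = flag_qc_outliers(qc)
print(result)
```

A flag like this is only a reason to look closer at the sample, not an automatic license to drop it.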
If you'd had 6 samples, and one clear outlier, you'd have a stronger case for just dropping it, but with only three, you really can't.
Thank you, this is what I thought, but others were telling me otherwise.