Hello everyone!
I am using DESeq2 to analyze RNA-seq data from 6 samples (3 replicates per condition). I have normalized the counts and plotted them with PCA and a correlation heatmap to check whether the replicates are similar before proceeding to differential expression analysis. However, I am not sure whether I should exclude the Mock2 replicate (see links below), because its gene expression does not look like that of the other two Mock samples. That said, Mock2 differs from Mock1 and Mock3 mainly along the 2nd principal component, not the 1st, which explains most of the variance. To my simple understanding, this means that keeping Mock2 is not a big problem. My questions are:
1) Should I keep Mock2, or remove it from subsequent analyses because it could potentially skew the results?
2) If I remove it, I would end up with only 2 replicates for one condition vs. 3 for the other. Could that also skew the results?
Thank you very much in advance for your help!
If you complete the analysis, how many DEGs show up? Since PC1 nicely separates the samples by condition and explains 82% of the variance in the data, I suspect you will get a generous number of DEGs if there is an appreciable difference between the samples.
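If it helps to put a number on how different Mock2 is, here is a minimal sketch (plain Python, not DESeq2 itself) of the quantity a correlation heatmap displays: pairwise Pearson correlation of log2-transformed normalized counts. The counts, the gene set, and the helper names (`log2_counts`, `pearson`, `correlation_matrix`) are all made up for illustration; they are not your data, and the log2(count + 1) transform is only an assumed stand-in for whatever transform you actually applied.

```python
import math


def log2_counts(counts):
    """Log-transform counts with a pseudocount of 1 (assumed transform)."""
    return [math.log2(c + 1) for c in counts]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(samples):
    """Return nested dicts of sample-to-sample correlations on log2 counts."""
    logged = {name: log2_counts(counts) for name, counts in samples.items()}
    names = list(logged)
    return {a: {b: pearson(logged[a], logged[b]) for b in names} for a in names}


# Hypothetical normalized counts for five genes (illustration only).
toy_samples = {
    "Mock1": [10, 200, 35, 1200, 5],
    "Mock2": [14, 150, 60, 900, 9],
    "Mock3": [11, 210, 33, 1150, 4],
}

corr = correlation_matrix(toy_samples)
for name, row in corr.items():
    print(name, [round(row[other], 3) for other in corr])
```

In a table like this, a replicate whose correlations with its siblings are only slightly lower is usually less worrying than one that correlates better with the other condition than with its own.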
Thank you very much for your help! Much appreciated!