Hi all,
I am a beginner with RNA-seq analysis and am using DESeq2. The experimental design is: cells from 4 subjects were cultured and then treated with a small molecule. I wish to test for differential expression (DE) between the control and treated conditions, with the subjects serving as replicates. The design formula I am using is ~ subject + treatment.
When I plot the principal components PC1 vs PC2, the samples separate by subject along PC1. A similar trend is seen up to PC3. However, when I plot PC1 vs PC4, I can see that PC4 separates the samples by treatment.
How do I regress out PC1 (or the subject) from the data so that I can get DE for treatment? Also, only 7% of the variation in the data is explained by PC4 (which separates the samples by treatment). Is there a metric for how much I can trust the results from this analysis? Thank you, and I apologize if this question has been asked before. I don't know what terms I should be searching for.
edit: Thanks genomax for pointing out image upload.
Thanks for the reply. So the differential expression result from DESeq2 is already corrected for the effect of the subject? How should I reconcile the DESeq2 results with the PCA output?
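There's nothing to reconcile. PCA shows the largest sources of variation across all genes, and in your data that is the subject. Your samples are rather different, your treatment doesn't affect a whole lot of genes, but it affects some. Since subject is in your design formula, DESeq2 will find them.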
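If it helps to see why the PCA and the DE results can both be right, here is a minimal sketch on simulated data. It is plain NumPy, not DESeq2, and every number in it (1000 genes, 50 treatment-responsive genes, the effect sizes, the noise level) is a made-up assumption for illustration only. It builds a paired design like yours, shows that the first PCs are dominated by subject, and then subtracts each subject's per-gene mean, which is roughly what including subject in the design does for the treatment comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_subjects = 1000, 4

# Paired layout: each subject contributes one control and one treated sample.
subject = np.repeat(np.arange(n_subjects), 2)   # 0,0,1,1,2,2,3,3
treated = np.tile([0, 1], n_subjects)           # 0,1,0,1,...

# Hypothetical log-scale expression (all values below are assumptions):
# a large subject effect on every gene, a treatment shift on 50 genes only.
subject_effect = rng.normal(0.0, 2.0, size=(n_subjects, n_genes))
treatment_effect = np.zeros(n_genes)
treatment_effect[:50] = rng.choice([-1.5, 1.5], size=50)
noise = rng.normal(0.0, 0.2, size=(subject.size, n_genes))
X = subject_effect[subject] + np.outer(treated, treatment_effect) + noise


def pca_scores(mat):
    """Return sample scores and the fraction of variance per component."""
    centered = mat - mat.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u * s, s**2 / np.sum(s**2)


# PCA on the raw matrix: the leading components track subject.
scores_raw, var_raw = pca_scores(X)

# Remove the subject effect by subtracting each subject's per-gene mean.
X_adj = X.copy()
for s_id in range(n_subjects):
    rows = subject == s_id
    X_adj[rows] -= X[rows].mean(axis=0)

# PCA on the adjusted matrix: PC1 now reflects treatment.
scores_adj, var_adj = pca_scores(X_adj)

print("raw: variance explained by PC1-PC3 = %.3f" % var_raw[:3].sum())
print("adjusted: variance explained by PC1 = %.3f" % var_adj[0])
print("adjusted PC1 scores, control:", np.round(scores_adj[treated == 0, 0], 2))
print("adjusted PC1 scores, treated:", np.round(scores_adj[treated == 1, 0], 2))
```

The subtraction here is only for looking at the data. For a real dataset, a common route (as far as I know) is to apply limma's removeBatchEffect to the vst-transformed values before plotting the PCA, while still giving DESeq2 the raw counts with the ~ subject + treatment design for the actual test.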
Oh, that makes sense. I checked the results (padj < 0.05) and I have ~1100 genes that are significant for DE, but the log2FC only ranges from -1.8 to +2.0. Thanks for helping me out!