Hi all, I tried using the DESeq2 plotPCA function after vst normalization of my dds object. However, the PCA plot I obtained shows the samples separated along a single principal component, PC1, with 100% of the variance explained. How is this possible, and is it even appropriate to proceed with DEG analysis from here?
This intrigues me. Could you share the vst-normalized data that you are using for the PCA?
Sure, here is the drive link to it.
Thanks for the data. This doesn't look like realistic data to me, as all the genes seem (sorry, I'm on mobile, so I only had a quick look) to be expressed at almost the same level (9-11). Nevertheless, unless there is a distant 'outlier' in the data, I don't see how PCA can put 100% of the variance in the first component.
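For what it's worth, one situation where PC1 is guaranteed to carry all of the variance is a PCA on only two samples: after centering each gene across samples, the two samples are mirror images of each other, so there is only one direction left to vary along. The thread does not say how many samples are involved, so this is only a possibility, not a diagnosis. Below is a minimal Python sketch of that arithmetic. The function name and the numbers are made up for illustration, and this is not how plotPCA is implemented.

```python
import math


def pc_variance_fractions(sample_a, sample_b):
    """Return the fraction of variance on (PC1, PC2) for exactly two samples.

    Each argument is a list of per-gene values (hypothetical, vst-like numbers).
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must cover the same genes")

    # Centre each gene across the two samples, as PCA does before decomposing.
    centred_a = [(a - b) / 2 for a, b in zip(sample_a, sample_b)]
    centred_b = [(b - a) / 2 for a, b in zip(sample_a, sample_b)]

    # 2x2 Gram matrix of the centred samples; its eigenvalues are
    # proportional to the variances of the principal components.
    g_aa = sum(x * x for x in centred_a)
    g_bb = sum(y * y for y in centred_b)
    g_ab = sum(x * y for x, y in zip(centred_a, centred_b))

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    midpoint = (g_aa + g_bb) / 2
    half_gap = math.sqrt(((g_aa - g_bb) / 2) ** 2 + g_ab ** 2)
    eig1, eig2 = midpoint + half_gap, midpoint - half_gap

    total = eig1 + eig2
    if total == 0:
        raise ValueError("the two samples are identical; there is no variance")
    return eig1 / total, eig2 / total


# Made-up values in the 9-11 range mentioned above.
pc1, pc2 = pc_variance_fractions([9.1, 10.4, 9.8, 11.0], [10.9, 9.2, 10.1, 9.0])
print(f"PC1: {pc1:.0%}, PC2: {pc2:.0%}")
```

Whatever values go in, the second eigenvalue comes out as zero up to floating-point rounding, so with two samples a 100% PC1 says nothing about the biology.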
There are quite a lot of significantly upregulated/downregulated genes after DESeq2, though. I have more than 700 in each category, with log2FC > 1.5 or < -1.5.
If the mean of the controls differs from the mean of the treatments by >= 1.5, you can easily get significant genes (e.g. control is 9, treatment is 11). However, that doesn't explain why all genes are expressed at almost the same 'high' level.
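To spell out the arithmetic in that example, here is a small Python sketch, assuming the values are on a log2-like scale (which vst output approximately is). The function names and replicate values are hypothetical, and DESeq2 itself estimates log2FC from its model on the counts, not by subtracting vst means, so treat this purely as an illustration of the numbers.

```python
from statistics import mean


def log2_fold_change(control, treatment):
    """Difference of group means for values that are already on a log2 scale."""
    return mean(treatment) - mean(control)


def passes_cutoff(log2fc, cutoff=1.5):
    """True if the absolute log2 fold change reaches the cutoff (1.5 here)."""
    return abs(log2fc) >= cutoff


# Hypothetical replicates matching the example: control at 9, treatment at 11.
control = [9.0, 9.0, 9.0]
treatment = [11.0, 11.0, 11.0]

lfc = log2_fold_change(control, treatment)
print(lfc, 2 ** lfc, passes_cutoff(lfc))
```

A difference of 2 on the log2 scale is a 4-fold change, so a gene can clear a 1.5 cutoff even though both groups sit inside the narrow 9-11 range.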
Is this the full data? If yes, which genome is this? The human genome contains >50k genes (of all kinds).
Yes, it's the full data. I'm working on yeast.