I have a clinical bulk RNA-seq dataset with 3 different conditions/groups. I notice that if I use a standard workflow of scaling/centering the data before dimensionality reduction (PCA and t-SNE), I get a messy plot of the patients with no clear grouping. But when I log-transform the data first, the groups become distinct and tight.
Is this an artifact of my data? Is it generally better to log-transform raw count data prior to scaling for dimensionality reduction?
Thank you for those points. Sorry, I should have clarified - I did use DESeq to normalize the data first. But if I understand you correctly, even after normalizing the data, it is pretty standard to log-transform it prior to PCA/dimensionality reduction. Appreciate the clarification!
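
For anyone comparing the two workflows, here is a minimal sketch in Python (NumPy only) of PCA with and without a log transform before centering/scaling. Everything in it is an assumption for illustration: the counts are simulated rather than real, they stand in for already-normalized counts, the pseudocount of 1 and the `log2` base are arbitrary choices, and `pca_scores` is a hypothetical helper, not part of DESeq or any other package.

```python
import numpy as np


def pca_scores(matrix, n_components=2):
    """Center and unit-scale each gene (column), then project the samples
    (rows) onto the top principal components using an SVD."""
    centered = matrix - matrix.mean(axis=0)
    std = centered.std(axis=0)
    keep = std > 0  # drop genes with zero variance to avoid dividing by zero
    scaled = centered[:, keep] / std[keep]
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Simulated stand-in for normalized counts: 3 groups x 10 samples, 500 genes.
rng = np.random.default_rng(0)
n_per_group, n_genes = 10, 500
groups = np.repeat([0, 1, 2], n_per_group)

base = rng.lognormal(mean=4.0, sigma=1.5, size=n_genes)  # per-gene baseline
fold = np.ones((3, n_genes))
fold[1, :50] = 4.0     # genes up-regulated in group 1 (hypothetical)
fold[2, 50:100] = 4.0  # genes up-regulated in group 2 (hypothetical)
mu = base * fold[groups]

# Overdispersed counts drawn as a gamma-Poisson mixture.
dispersion = 0.2
counts = rng.poisson(rng.gamma(shape=1.0 / dispersion, scale=mu * dispersion))

# Workflow A: center/scale the counts directly.
scores_raw, explained_raw = pca_scores(counts.astype(float))

# Workflow B: log-transform first (pseudocount of 1 is an assumption).
log_counts = np.log2(counts + 1.0)
scores_log, explained_log = pca_scores(log_counts)

print("variance explained, no log:", explained_raw)
print("variance explained, log2(x + 1):", explained_log)
```

Plotting `scores_raw` and `scores_log` colored by `groups` gives a side-by-side view of the two workflows; how much they differ on real data will depend on the dataset.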