I have a dataset of hundreds of different samples and their normalized (by library size) feature counts.
I want to perform downstream analysis on these data, starting with PCA using prcomp. Should I center and scale the values before PCA, or is the normalization of reads enough?
Thanks!
You can use log-transformed values and add a pseudocount to reduce the bias towards highly expressed transcripts. Higher expression values always dominate the variation between samples over lower values (less expressed transcripts). If you do not do this, the results will be based almost entirely on highly expressed transcripts. This plays a major role when your analysis includes less expressed transcripts, especially non-coding RNAs alongside protein-coding ones.
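To make the point about dominant features concrete, here is a minimal sketch in base R. The data are simulated and the feature names (high, mid, low) are hypothetical; it only compares each feature's share of the total variance before and after log2(x + 1), and is not meant as a recommended pipeline.

```r
# Simulated, hypothetical data: 6 samples (rows) x 3 features (columns)
# with very different expression levels.
set.seed(1)
counts <- cbind(
  high = rpois(6, lambda = 10000),
  mid  = rpois(6, lambda = 100),
  low  = rpois(6, lambda = 2)
)

# Share of the total variance contributed by each feature on the raw scale.
raw_var   <- apply(counts, 2, var)
raw_share <- raw_var / sum(raw_var)

# Same calculation after log2 with a pseudocount of 1 (avoids log2(0)).
log_counts <- log2(counts + 1)
log_var    <- apply(log_counts, 2, var)
log_share  <- log_var / sum(log_var)

print(round(raw_share, 3))
print(round(log_share, 3))
```

Comparing the two printed vectors shows how much of the variance the highly expressed feature accounts for on each scale.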
Thanks. I wasn't referring to log-transforming, but to "scaling" and "centering", which are standardization options when performing PCA. Since I am not visualizing the expression counts, but rather the samples and their coordinates in PCA space, I don't think it makes any difference if I log-transform the data in this case.
I was not able to explain it properly, but you will find a better explanation here
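For reference, the centering and scaling options mentioned above correspond to the center and scale. arguments of prcomp. The sketch below uses a small simulated matrix with hypothetical feature names; it assumes samples in rows and features in columns, which is the layout prcomp expects (a features-by-samples matrix would need t() first).

```r
set.seed(1)
# Hypothetical matrix: 10 samples (rows) x 2 features (columns) on very different scales.
x <- cbind(
  high = rnorm(10, mean = 1000, sd = 100),
  low  = rnorm(10, mean = 5, sd = 1)
)

# prcomp defaults: center = TRUE, scale. = FALSE.
pca_centered <- prcomp(x, center = TRUE, scale. = FALSE)
# scale. = TRUE additionally rescales each feature to unit variance.
pca_scaled   <- prcomp(x, center = TRUE, scale. = TRUE)

# Absolute PC1 loadings: which feature drives the first component?
print(abs(pca_centered$rotation[, "PC1"]))
print(abs(pca_scaled$rotation[, "PC1"]))

# Sample coordinates in PCA space are stored in $x.
head(pca_scaled$x)
```

Without scaling, PC1 is driven by the feature with the largest variance; with scale. = TRUE, both features contribute on an equal footing.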
Hello Kevin,
I really like your way of explaining and clarifying concepts, so I thought I would ask you directly about my issue. Could you please respond when your busy schedule allows? Thanks.
My question is: I have 57 tumor samples, 3 healthy samples, and 3 carrier samples (as condition in the DESeq2 colData). I created a VSD object using the vst function on the DESeq2 object and tried to check for sample mix-ups using PCA.
After running plotPCA and pheatmap, I got a PCA plot and a heatmap (attached), which showed tumor and healthy samples mixed together.
I was wondering how I can correct this so that the samples appear separated (clustered) on the PCA plot. Thanks.
Hello Kevin,
Here are the links for the PCA and heatmap plots:
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.
I apologize, Kevin. As a first-time user of Biostars, I didn't know about this.