8 months ago
tujuchuanli
Hi,
I recently worked with the R package 'bluster' (https://bioconductor.org/packages/release/bioc/vignettes/bluster/inst/doc/clusterRows.html#graph-based-clustering). The manual uses an scRNA-seq dataset as an example to demonstrate various clustering techniques. The authors first perform PCA using 'runPCA' from the scater package. I noticed that the input to 'runPCA' was log-transformed. My question is: is log transformation necessary for all types of data, or is it specific to scRNA-seq data?
Thanks.
You probably always want to transform your data to a log-like scale first. On the untransformed scale, the variance of a gene (or observation, measurement...) increases with the magnitude of its counts. A log transformation corrects for that in large part, which is why it is a standard step. For RNA-seq/scRNA-seq, that can be logcounts, logCPM, logTPM, or something similar.
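To make the mean-variance point concrete, here is a minimal simulation sketch. It is not taken from bluster or scater: the negative binomial (gamma-Poisson) count model, the dispersion of 0.2, the pseudocount of 1, and the function names are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random
import statistics


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count by counting unit-rate arrivals in [0, lam]."""
    total, k = 0.0, 0
    while True:
        total += rng.expovariate(1.0)
        if total > lam:
            return k
        k += 1


def simulate_counts(mean_count, dispersion, n_cells, rng):
    """Simulate overdispersed counts for one hypothetical gene (gamma-Poisson mixture)."""
    shape = 1.0 / dispersion
    scale = mean_count * dispersion  # gamma mean = shape * scale = mean_count
    return [poisson_sample(rng.gammavariate(shape, scale), rng) for _ in range(n_cells)]


def variance_raw_and_log(mean_count, dispersion=0.2, n_cells=1000, seed=0):
    """Return (variance of raw counts, variance of log2(count + 1))."""
    rng = random.Random(seed)
    counts = simulate_counts(mean_count, dispersion, n_cells, rng)
    logged = [math.log2(c + 1) for c in counts]  # pseudocount of 1 is an assumption
    return statistics.variance(counts), statistics.variance(logged)


if __name__ == "__main__":
    for mean_count in (5, 50, 500):
        raw_var, log_var = variance_raw_and_log(mean_count)
        print(f"mean={mean_count:>4}  raw variance={raw_var:10.1f}  log2 variance={log_var:.3f}")
```

Under these assumptions, the raw variance spans several orders of magnitude across the three simulated genes, while the variance on the log scale stays within a much narrower range. Without the transformation, a PCA would be dominated by the highest-count genes.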
Thank you for your response. I have just one more question. I have more than 100 gene signature lists, each containing 50-100 genes. I have scored each cell in a scRNA-seq dataset against them, resulting in a score matrix with rows representing cells and columns representing signatures. I aim to perform PCA on this score matrix. My question is: should I log-transform the score matrix before proceeding with the PCA?
Thank you for pointing out my mistake; I am deeply embarrassed. This is my favorite forum, where I have found answers to many of my questions, which have been tremendously helpful. I would like to ask whether I should click the 'accepted' button after my questions have been answered. However, I can't find this button. Could you please tell me where it is?
Thank you, and apologies again for my mistake.
This is the grey check button to the left of the correct answer.
OK, thank you for letting me know. I have clicked the button.