Hi,
I have RNA-seq data that I would like to visualise with a PCA plot and a heatmap. I am wondering whether I should use normalised or log transformed normalised counts for this.
I have generated TMM-normalised counts per million in edgeR as follows:
y <- calcNormFactors(y)
tmm <- edgeR::cpm(y)
I have also generated log2 transformed normalised TMM CPM:
tmm_log <- edgeR::cpm(y, log = TRUE, prior.count = 1)
I am wondering whether it is best to use just the normalised CPMs, or the log-transformed normalised CPMs for a PCA plot and heatmap. I find that the plots look better when I use log-transformed normalised counts, but I am not sure whether this is the correct approach.
Could someone please explain why you would/would not want to use log counts?
Many thanks,
Lucy
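A rough way to see why the log scale tends to give better-looking plots is to check how much of the total variance is carried by a handful of genes before and after the transform. The sketch below is only an illustration under stated assumptions: the counts are simulated (not real data), the CPM is computed by plain library-size scaling without TMM, and log2(CPM + 1) is used as a simplified stand-in for what edgeR::cpm(log = TRUE) does. The helper top_share is a made-up name for this example.

```r
# Simulated data only: 2000 hypothetical genes x 6 samples, negative binomial counts
set.seed(1)
n_genes <- 2000
n_samples <- 6
gene_means <- rlnorm(n_genes, meanlog = 4, sdlog = 2)  # wide range of expression levels
counts <- matrix(rnbinom(n_genes * n_samples, mu = gene_means, size = 10),
                 nrow = n_genes, ncol = n_samples)

# Plain CPM and a simple log2 transform (assumption: no TMM, pseudo-count of 1)
lib_sizes <- colSums(counts)
cpm_raw <- t(t(counts) / lib_sizes) * 1e6
cpm_log <- log2(cpm_raw + 1)

# Hypothetical helper: share of the summed per-gene variance that comes from
# the top 1% most variable genes
top_share <- function(x, frac = 0.01) {
  v <- apply(x, 1, var)
  k <- ceiling(frac * length(v))
  sum(sort(v, decreasing = TRUE)[seq_len(k)]) / sum(v)
}
share_raw <- top_share(cpm_raw)
share_log <- top_share(cpm_log)
print(c(raw = share_raw, log = share_log))

# PCA with samples as observations and genes as variables, as for a PCA plot
pca_log <- prcomp(t(cpm_log))
```

On the untransformed scale the variance of a gene grows with its expression level, so a few highly expressed genes can drive both the principal components and the heatmap colours. The log transform compresses that range, which is the usual argument for using log CPM for these visualisations.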
Thank you, I am currently scaling by row using the heatmap.2 function from the gplots package. Is this an acceptable way to do the scaling?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment should go under @ATPoint's answer. SUBMIT ANSWER is for new answers to the original question.
Without code I cannot comment.
heatmap.2(tmm_log, trace = "none", col = bluered(20), scale = "row")
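For reference, scale = "row" converts each row to z-scores (row mean subtracted, then divided by the row standard deviation) before the colours are assigned. A minimal sketch of the same computation done by hand, assuming a small made-up matrix called toy in place of tmm_log:

```r
# Hypothetical toy matrix standing in for tmm_log (3 genes x 4 samples)
toy <- matrix(c( 1,  2,  3,  4,
                10, 10, 12, 12,
                 5,  7,  5,  7),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# scale() works on columns, so transpose, scale, and transpose back
# to get per-row (per-gene) z-scores
row_z <- t(scale(t(toy)))

# Each row should now have mean 0 and standard deviation 1
print(round(rowMeans(row_z), 10))
print(apply(row_z, 1, sd))
```

Scaling by hand like this and then calling heatmap.2 with scale = "none" makes it explicit which values are being clustered and which are being coloured.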