Hi there, I am struggling to wrap my head around something I may have forgotten. Essentially, the rlog-transformed values (rld) that I use in my heat map change depending on the number of samples I include, and I am wondering why this happens. Since I am not sure I have explained myself clearly, let me rephrase:
I want to generate 2 heat maps: the first from the main comparison I am interested in (6 samples), and the second from all samples in my dataset (the same 6 samples plus 2 more).
When I do this, I obtain different transformed counts for the same genes in the 2 heat maps. Is this because the regularised logarithm transformation (rlog) gives different results depending on which samples are in the dataset?
thanks
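One ingredient that depends on the sample set is size-factor normalisation: the median-of-ratios approach compares every sample with a per-gene geometric mean taken over all samples, so adding samples shifts that reference. The sketch below is a pure-Python toy version of that idea, not DESeq2's actual code, and it does not reproduce rlog itself. The function name `size_factors`, the sample names, and all counts are made up for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Toy median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts (same gene order).
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Per-gene log geometric mean across ALL samples passed in;
    # genes with a zero count in any sample are skipped.
    log_geo = []
    for g in range(n_genes):
        vals = [counts[s][g] for s in samples]
        if all(v > 0 for v in vals):
            log_geo.append(sum(math.log(v) for v in vals) / len(vals))
        else:
            log_geo.append(None)
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - log_geo[g]
                      for g in range(n_genes) if log_geo[g] is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors

# Hypothetical counts for 3 genes: two "main" samples plus one extra sample.
main = {"A": [100, 200, 50], "B": [200, 400, 100]}
everything = dict(main, C=[50, 800, 10])

sf_main = size_factors(main)
sf_all = size_factors(everything)

# The raw counts of sample A are identical in both runs,
# but its size factor (and so its normalised counts) is not.
print(sf_main["A"], sf_all["A"])
```

In this toy case the size factor of sample A changes once sample C is added, even though A's raw counts are untouched. Any transformation built on quantities estimated from the whole dataset can shift in the same way.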
Thanks for the quick reply. I had always suspected this! I just want to show the top variable genes in my dataset... that's it.
Another question: should I stick with the same transformation (either vst or rlog) for all of the plots in my experiment, or can I change the method each time (e.g. rlog for PCA and vst for the heat map)? Thanks!
I would not switch around, as there should be consistency. Use whichever you prefer (or vst if you have many samples and rlog is too slow), but do not mix them at will, as they behave quite differently, especially for variable genes with low counts.
Alternatively, what I personally find more meaningful is to show only those genes that are significantly different, as high variability often comes from the mean-variance dependency for low-count genes. You could show the z-scored log2FCs for those with padj < 0.05. Still, if you prefer counts, do not mix methods and be consistent.
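The filter-then-z-score suggestion can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the heat map input is a table of per-sample log-scale values (one row per gene) plus an adjusted p-value per gene. The function name `zscore_significant`, the gene names, and all numbers are hypothetical, and missing padj values are represented as `None`.

```python
from statistics import mean, stdev

def zscore_significant(expr, padj, alpha=0.05):
    """Keep genes with padj < alpha and z-score each kept gene across samples.

    expr: dict gene -> list of log-scale values (one per sample).
    padj: dict gene -> adjusted p-value, or None if it is missing.
    """
    scaled = {}
    for gene, values in expr.items():
        p = padj.get(gene)
        if p is None or p >= alpha:
            continue  # not significant, or no adjusted p-value available
        m, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # a flat gene cannot be z-scored
        scaled[gene] = [(v - m) / sd for v in values]
    return scaled

# Made-up example: 3 genes, 6 samples (3 per group).
expr = {
    "geneA": [5.0, 5.2, 4.8, 8.0, 8.1, 7.9],
    "geneB": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],
    "geneC": [0.5, 2.5, 0.1, 3.0, 0.2, 1.9],
}
padj = {"geneA": 0.001, "geneB": 0.8, "geneC": None}

z = zscore_significant(expr, padj)
print(list(z))  # genes that survive the padj filter
```

Here the noisy gene with a missing padj and the flat, non-significant gene are both dropped, so the heat map is driven by the tested differences rather than by raw variability. Note that the z-scores themselves also depend on which samples are included in each row.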
Thank you so much, this was a great help. I was facing the same issue.