I would like to compute distances between the samples of an RNA-seq experiment. I read that the VST and rlog functions of the DESeq2 R package are a good way to transform the data so that the standard deviation of a gene's expression across all samples no longer depends on its mean expression across those samples. My questions are:
1 - Should these transformations be applied after normalising the raw counts for sequencing depth (with the DESeq() function), or directly to the raw counts?
2 - For a heatmap with a dendrogram representing the distances between samples, is it better to plot the VST/rlog-transformed values or FPKM values?
3 - The 'VST' method seems to be better suited to large data sets (n>30). I have 3 samples, so does that mean I need to choose 'rlog' instead?
4 - Both methods have a 'blind' parameter. In which situations should I set it to 'TRUE', and in which to 'FALSE'?
Regards.
Thank you. I have now noticed that applying VST to the raw values:
gives the same results as applying it to the normalised values:
Also, in section 4.2 of this tutorial it seems to be applied to the data before DESeq() is run, so maybe it does not matter whether the data are normalised or not?
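For anyone who wants to check this themselves, here is a minimal sketch of the comparison. It uses simulated counts from makeExampleDESeqDataSet(), not my own data, and the object names (dds, vsd_raw, vsd_fit, sample_dists) are just ones I made up for illustration. I call varianceStabilizingTransformation() here; as far as I understand, vst() is a faster wrapper around the same transformation that fits the trend on a subset of genes.

```r
# Sketch only: simulated data, not the experiment discussed in this thread.
library(DESeq2)

set.seed(1)
# Simulate a raw count matrix: 2000 genes, 6 samples in 2 conditions.
dds <- makeExampleDESeqDataSet(n = 2000, m = 6)

# VST applied directly to the raw-count object
# (size factors are estimated internally when they are missing).
vsd_raw <- varianceStabilizingTransformation(dds, blind = TRUE)

# VST applied after the full DESeq() pipeline has been run.
dds_fit <- DESeq(dds)
vsd_fit <- varianceStabilizingTransformation(dds_fit, blind = TRUE)

# With blind = TRUE the dispersions are re-estimated under design ~ 1
# in both cases, so the two transformed matrices are expected to agree.
same <- isTRUE(all.equal(assay(vsd_raw), assay(vsd_fit)))
print(same)

# Euclidean distances between samples on the transformed scale,
# which can be passed to a clustering or heatmap function.
sample_dists <- dist(t(assay(vsd_raw)))
print(round(as.matrix(sample_dists), 1))
```

Note that this only compares the blind = TRUE case. With blind = FALSE, an object that has already been through DESeq() reuses its fitted dispersion trend, so I would not assume the two calls are interchangeable there.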
Right, as: