Hi, everyone. I want to ask about "normalization". This is a very confusing term for me.
Suppose a basic bulk RNA-Seq pipeline, like hisat2 → featureCounts → DESeq2. In this situation, I want to draw a PCA plot, a dendrogram, a co-scatter plot, and a heatmap.
Currently, I am normalizing as follows.
- PCA: R function prcomp(data, scale = TRUE)
- dendrogram: none (I use distances calculated from the raw count matrix)
- co-scatter plot: I have no idea which method I should use
- heatmap: Z-score calculated from the raw count matrix
So, I have some questions.
- Is my normalization appropriate?
- Which method is good for a co-scatter plot?
- I understand the Z-score, but for the other methods, what is the objective of normalization?
- Why do some methods use log values? Also, doesn't the expression value lose its meaning after normalization?
Thanks
Please read the DESeq2 manual ( http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html ), it is all explained in there. It offers a convenience function for PCA (plotPCA) and a couple of normalization methods (vst/rlog) to be used upstream of applications such as PCA, clustering, or other machine learning applications.
Whatever you do in bioinformatics, no clustering or similar analysis should ever be done on raw data. By that I mean that one always has to normalize data prior to any analysis. DESeq2 itself accepts raw counts and normalizes them internally before running the differential analysis. Heatmaps can indeed be based on the Z-score, but this should be calculated from log2-transformed normalized counts. A transformation that both normalizes the counts and puts them on a log-like scale is e.g. vst.
Thanks. So, in bulk RNA-Seq, processing the raw count matrix with vst or rlog before any analysis is standard, right?
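To make the order of steps concrete (normalize, then log-transform, then Z-score per gene), here is a minimal sketch in plain Python. It is not DESeq2 and not vst/rlog: it assumes a simple median-of-ratios size factor followed by log2 with a pseudocount of 1, and the function names and toy counts are made up for illustration.

```python
import math
import statistics

def size_factors(counts):
    """Median-of-ratios size factors. counts: list of genes, each a list of per-sample counts."""
    n_samples = len(counts[0])
    # Keep only genes without zeros so the log geometric mean is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to that gene's geometric mean across samples.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors

def log2_normalized(counts, factors, pseudocount=1.0):
    """Divide each sample by its size factor, then take log2(x + pseudocount)."""
    return [[math.log2(c / f + pseudocount) for c, f in zip(row, factors)] for row in counts]

def row_zscores(matrix):
    """Z-score each gene (row) across samples; constant rows become all zeros."""
    out = []
    for row in matrix:
        mean = statistics.mean(row)
        sd = statistics.stdev(row)
        out.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return out

# Toy data (invented): 4 genes x 3 samples.
# For the three genes without zeros, sample 3 has exactly twice the counts of sample 1.
counts = [
    [100, 120, 200],
    [50, 40, 100],
    [10, 30, 20],
    [0, 5, 3],
]

factors = size_factors(counts)                # per-sample scaling for sequencing depth
log_norm = log2_normalized(counts, factors)   # input for PCA / clustering / scatter plots
z = row_zscores(log_norm)                     # per-gene Z-scores, e.g. for a heatmap
```

In this toy matrix the size factor of sample 3 comes out as twice that of sample 1, so the depth difference is removed before any Z-scores are computed. In a real analysis the vst or rlog output from DESeq2 would take the place of `log_norm`.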
In any *-seq you have to normalize. Please read e.g. https://peerj.com/preprints/27283/ and get a solid background before analyzing data.
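A small sketch of why distances on raw counts mislead, e.g. for a dendrogram. It uses counts per million (CPM) as a deliberately simple stand-in for normalization, not the DESeq2 method, and the three samples are invented numbers.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two samples given as equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def counts_per_million(sample):
    """Scale a sample so its counts sum to one million (removes sequencing depth)."""
    total = sum(sample)
    return [c / total * 1e6 for c in sample]

# Invented counts for 4 genes.
sample_a = [100, 50, 10, 840]
sample_b = [200, 100, 20, 1680]  # same composition as A, sequenced twice as deep
sample_c = [400, 50, 10, 540]    # same depth as A, but different composition

# On raw counts, A looks closer to C than to B, purely because of depth.
raw_ab = euclidean(sample_a, sample_b)
raw_ac = euclidean(sample_a, sample_c)

# After depth normalization, A and B coincide and only C differs.
cpm_ab = euclidean(counts_per_million(sample_a), counts_per_million(sample_b))
cpm_ac = euclidean(counts_per_million(sample_a), counts_per_million(sample_c))

print(f"raw: d(A,B)={raw_ab:.1f}  d(A,C)={raw_ac:.1f}")
print(f"CPM: d(A,B)={cpm_ab:.1f}  d(A,C)={cpm_ac:.1f}")
```

Clustering on the raw matrix would group A with C here, although A and B are the same library at different depths.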
To me, all the methods you mentioned above are 'scaling' methods, not normalization.
What is the main difference between "scaling" and "normalization" ?