You need to elaborate.
Why do we supply the colData data frame when normalizing with DESeq2 in R?
DESeq2 doesn't perform any normalization of the metadata (i.e. colData) for each sample.
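What is the mathematical basis of the normalization?
The mathematics of the DESeq2 approach are well explained in the DESeq2 paper (Love, Huber and Anders, 2014). In brief, each sample's size factor is estimated with the median-of-ratios method, and the normalized counts are the raw counts divided by that size factor.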
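For reference, the median-of-ratios estimator can be written as follows. This is a summary in my own notation (K_ij for the raw count of gene i in sample j, m for the number of samples, s_j for the size factor), so check it against the paper before relying on it:

```latex
% Per-gene geometric mean across the m samples (the pseudo-reference sample)
K_i^{R} = \left( \prod_{j=1}^{m} K_{ij} \right)^{1/m}

% Size factor of sample j: median, over genes with a non-zero reference, of count / reference
s_j = \operatorname*{median}_{i \,:\, K_i^{R} \neq 0} \frac{K_{ij}}{K_i^{R}}

% Normalized count
\tilde{K}_{ij} = \frac{K_{ij}}{s_j}
```

Note that nothing in these expressions refers to the sample annotations, only to the counts.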
Does the result differ if I change the condition in the colData, and why?
Generally, no, the normalized count values will not change. See the relevant vignette section for more info:
"The design variables are not used when estimating the size factors, and counts(dds, normalized=TRUE) is providing counts scaled by size or normalization factors. The design is only used when estimating dispersion and log2 fold changes."
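To illustrate the vignette's point, here is a minimal sketch of median-of-ratios normalization in plain Python. It is not DESeq2's code: the function names, the toy count matrix and the condition labels are all made up for illustration. The condition argument is accepted but deliberately never used, which mirrors the fact that size factors depend only on the counts.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of raw counts (one per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean is zero and the ratio is undefined.
    """
    kept = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / len(row) for row in kept]
    n_samples = len(counts[0])
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm for row, lgm in zip(kept, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalized_counts(counts, condition=None):
    """Divide each sample's counts by its size factor.

    `condition` is a hypothetical stand-in for a colData column; it is
    ignored on purpose, since the design is not used at this step.
    """
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]


# Toy data: 4 genes x 4 samples (invented values).
counts = [
    [100, 200, 50, 400],
    [10, 20, 5, 40],
    [30, 60, 15, 120],
    [0, 4, 1, 8],
]
labels_a = ["ctrl", "ctrl", "treated", "treated"]
labels_b = ["treated", "ctrl", "ctrl", "treated"]  # relabelled conditions

norm_a = normalized_counts(counts, labels_a)
norm_b = normalized_counts(counts, labels_b)
print(norm_a == norm_b)
```

Relabelling the samples leaves the normalized matrix untouched. What would change are the downstream dispersion estimates, fold changes and test statistics.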
Why doesn't the colData design condition matter for the normalized file used in GSEA?
This question is unclear, but I assume you mean pre-ranked GSEA. Because pre-ranked GSEA takes a list of genes ranked by some metric (usually the test statistic, the fold change, etc.), the design does matter: the rank metric depends on the contrast being used.
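A small sketch of why the contrast matters for a pre-ranked list. It assumes a very simplified rank metric (log2 ratio of group means with a pseudocount of 1), not the Wald statistic or shrunken fold change DESeq2 would give you, and the gene names, values and labels are invented:

```python
import math


def log2_fold_change(row, labels, numerator, denominator):
    """Log2 ratio of group means, with a pseudocount of 1 (simplified metric)."""
    num = [v for v, lab in zip(row, labels) if lab == numerator]
    den = [v for v, lab in zip(row, labels) if lab == denominator]
    mean_num = sum(num) / len(num)
    mean_den = sum(den) / len(den)
    return math.log2((mean_num + 1) / (mean_den + 1))


def ranked_genes(norm, labels, numerator, denominator):
    """Return gene names sorted from most up- to most down-regulated."""
    scores = {
        gene: log2_fold_change(row, labels, numerator, denominator)
        for gene, row in norm.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Invented normalized counts for three genes across four samples.
norm = {
    "geneA": [10.0, 12.0, 80.0, 90.0],
    "geneB": [50.0, 55.0, 52.0, 49.0],
    "geneC": [200.0, 210.0, 20.0, 25.0],
}
labels = ["ctrl", "ctrl", "treated", "treated"]

rank_treated_vs_ctrl = ranked_genes(norm, labels, "treated", "ctrl")
rank_ctrl_vs_treated = ranked_genes(norm, labels, "ctrl", "treated")
print(rank_treated_vs_ctrl)
print(rank_ctrl_vs_treated)
```

The normalized values are identical in both calls, yet the ranking flips with the direction of the contrast, and it would change again if the samples were assigned to different groups.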
Thank you so much, you are a genius.