Question

Why do we supply a colData data frame when normalizing with DESeq2 in R?

2.2 years ago

Amr ▴ 180

Why do we supply a colData data frame when normalizing with DESeq2 in R? What is the mathematical reasoning behind the normalization? Do the normalized values differ if I change the condition in the colData, and why or why not? And how is it that the design condition in the colData data frame doesn't matter for the normalized file used in GSEA?

Thanks

dataframe coldata DESeq2 normalization R GSEA • 680 views

score 3 · Accepted Answer · 2022-09-21

You need to elaborate.

Why do we supply a colData data frame when normalizing with DESeq2 in R?

DESeq2 doesn't perform any normalization of the sample metadata (i.e. colData). Normalization is applied to the count matrix; colData only describes the samples.

What is the mathematical reasoning behind the normalization?

The mathematics of the DESeq2 approach are well explained in the paper.
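As a brief summary, and assuming the default size-factor estimation (the median-of-ratios method) is what is being asked about, the calculation can be written as below. Here K_ij is the raw count for gene i in sample j and m is the number of samples; these symbols are chosen for illustration. Note that no design variable appears anywhere in it.

```latex
K_i^{R} = \Bigl(\prod_{j=1}^{m} K_{ij}\Bigr)^{1/m}, \qquad
\hat{s}_j = \operatorname*{median}_{i:\,K_i^{R} \neq 0} \frac{K_{ij}}{K_i^{R}}, \qquad
K_{ij}^{\text{norm}} = \frac{K_{ij}}{\hat{s}_j}
```

The first term is the per-gene geometric mean across samples (a pseudo-reference sample), the second is the size factor for sample j, and the third is the normalized count.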

Do the normalized values differ if I change the condition in the colData, and why or why not?

Generally, no, the normalized count values will not change. See the relevant vignette section for more info:

"The design variables are not used when estimating the size factors, and counts(dds, normalized=TRUE) is providing counts scaled by size or normalization factors. The design is only used when estimating dispersion and log2 fold changes."
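To make the vignette's point concrete, here is a minimal sketch of median-of-ratios size factors in plain Python. It is not DESeq2's code, only an illustration under the assumption that the default estimator is used; the function names, the toy counts, and the two label vectors are all made up for this example. The thing to notice is that the condition labels are never passed to the normalization functions, so relabeling samples cannot change the result.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of raw counts per sample.
    Genes with a zero in any sample are skipped, since their
    geometric mean is zero.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(k > 0 for k in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(k) for k in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[k / s for k, s in zip(row, factors)] for row in counts]


# Toy data (hypothetical): samples 2 and 4 were sequenced twice as deeply.
counts = [
    [10, 20, 10, 20],
    [100, 200, 100, 200],
    [50, 100, 50, 100],
    [0, 3, 1, 2],  # contains a zero, so it is ignored when estimating factors
]

# Two different labelings of the same samples (hypothetical).
# Neither one is an input to size_factors() or normalize().
condition_a = ["control", "control", "treated", "treated"]
condition_b = ["treated", "control", "treated", "control"]

factors = size_factors(counts)
normalized = normalize(counts, factors)
```

Whichever label vector ends up in the colData, the size factors and normalized counts above are identical, because only the count matrix enters the calculation.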

How is it that the design condition in the colData data frame doesn't matter for the normalized file used in GSEA?

This question is unclear, but I assume you mean pre-ranked GSEA. Because that uses a list of genes ranked by some metric (usually the test statistic, fold change, etc.), the design does matter there: the rank metric depends on the contrast being used.
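A small sketch of why the contrast matters for a pre-ranked list, again in plain Python. The rank metric here is a simple log2 fold change of group means with a pseudocount, which is an assumption made for illustration and not the statistic DESeq2 reports; the gene names, values, and labels are hypothetical.

```python
import math


def log2_fold_change(values, labels, numerator, denominator, pseudocount=1.0):
    """log2 ratio of group means, with a pseudocount to avoid division by zero."""
    num = [v for v, g in zip(values, labels) if g == numerator]
    den = [v for v, g in zip(values, labels) if g == denominator]
    mean_num = sum(num) / len(num)
    mean_den = sum(den) / len(den)
    return math.log2((mean_num + pseudocount) / (mean_den + pseudocount))


def ranked_genes(expr, labels, numerator="treated", denominator="control"):
    """Return gene names sorted from most up-regulated to most down-regulated."""
    scores = {
        gene: log2_fold_change(vals, labels, numerator, denominator)
        for gene, vals in expr.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Normalized expression per gene across four samples (hypothetical values).
expr = {
    "geneA": [10.0, 12.0, 40.0, 44.0],
    "geneB": [50.0, 48.0, 12.0, 10.0],
    "geneC": [20.0, 21.0, 19.0, 20.0],
}

# Same samples, two different condition assignments.
design_1 = ["control", "control", "treated", "treated"]
design_2 = ["treated", "treated", "control", "control"]

rank_1 = ranked_genes(expr, design_1)
rank_2 = ranked_genes(expr, design_2)
```

The expression values are the same in both cases, but swapping the labels reverses the ranking, so the list handed to pre-ranked GSEA changes with the design even though the normalized counts do not.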