For example, when you analyse the Untreated condition vs the Day1Treatment condition with only these samples in your design, you get one set of size factors.
If you have also set up a Day6Treatment condition in your design (but don't use it yet) and do the same Untreated vs Day1Treatment comparison, the size factors change (they now take Day6 into account).
I thought a size factor depended only on the library size of its own sample, so why do they change with the design file even when some of the samples are not used in the comparison?
I was analysing Day1Treatment vs Untreated and Day6Treatment vs Untreated with two separate design files. But now I am wondering whether it's better to have one design with all the samples and run both comparisons on the same size factors, because in the end the two approaches detect different differential genes.
Agreed. The idea is to estimate, for each column (sample), a size factor that best scales the samples against each other, based on a large set of genes that do not change between conditions. Provided none of your samples shows extreme global changes, it is probably best to have as many samples in the matrix as possible. That likely gives more robust size factors than using only two or three samples.
OK, I got it, but do you know how it is computed?
It is described in the original DESeq paper (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106), and I think the same method is used in DESeq2.
Yes, the concept is actually pretty simple but powerful. Check out StatQuest for a nice explanation.
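To make the above concrete, here is a rough Python sketch of the median-of-ratios idea from the DESeq paper. It is not DESeq2's actual R code. The function name `size_factors` and the toy counts are made up for illustration. Each gene gets a pseudo-reference (its geometric mean across all columns in the matrix), and a sample's size factor is the median ratio of its counts to that reference. Because the reference is built from every column, adding the Day6 column changes the size factors of the other samples too.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix (sketch)."""
    counts = np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore"):       # log(0) -> -inf, handled below
        log_counts = np.log(counts)
    # Per-gene log geometric mean across all samples in the matrix.
    log_geo_means = log_counts.mean(axis=1)
    # Genes with a zero count in any sample have no finite reference; skip them.
    usable = np.isfinite(log_geo_means)
    log_ratios = log_counts[usable] - log_geo_means[usable][:, None]
    # Size factor of a sample = median ratio to the per-gene reference.
    return np.exp(np.median(log_ratios, axis=0))

# Toy counts (made up). Columns: Untreated, Day1Treatment, Day6Treatment.
counts = np.array([
    [100, 200, 400],
    [ 50, 100, 200],
    [ 30,  60, 120],
    [ 10,  20, 500],
    [ 80, 160,  40],
    [  0,   5,   7],   # contains a zero, so it is ignored
])

sf_two = size_factors(counts[:, :2])   # only Untreated and Day1Treatment
sf_all = size_factors(counts)          # all three samples

print("Untreated, Day1 only:", np.round(sf_two, 3))   # about 0.707 and 1.414
print("With Day6 included:  ", np.round(sf_all, 3))   # about 0.5, 1.0 and 2.0
```

In this toy data the Untreated-to-Day1 ratio stays at 1:2 either way and only the common reference moves. With real, noisier counts the median can land on different genes once more samples are in the matrix, so the relative scaling can shift a little as well, which would be consistent with getting slightly different gene lists.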