Question

Including 2 factors vs 1 factor in Design Formula for DESeq2

0

4.5 years ago

wang.hanyin • 0

Dear community members,

I have 12 Nanostring samples from lymph node biopsies of 12 different patients. The samples cover 2 pathology types and 2 treatment types, and both pathology and treatment are known to affect RNA expression. I am hoping to examine differentially expressed genes between the two pathologies and also between the two treatments. My question is: when I run the differential expression analysis in DESeq2, should I (1) include both treatment and pathology in the design formula (that is, Design = ~treatment + pathology, with all 12 patients included in the comparison)? Or (2) include only one factor in the design formula but stratify the samples for the comparison (for example, with Design = ~pathology, run one DE analysis on the 6 patients with treatment A, then another on the remaining 6 patients with treatment B)?

The reason I am even considering the second approach is that I was told my study is not designed for a two-factor comparison. However, I personally feel the first approach would be good enough.

Deeply appreciate any assistance!

R sequencing • 893 views

ADD COMMENT • link updated 4.5 years ago by swbarnes2 15k • written 4.5 years ago by wang.hanyin • 0

0

Can you show your design? Something like:

Patient    Treatment     Pathology
1          A             I
2          B             II
3          B             I

ADD REPLY • link 4.5 years ago by Carlo Yague 9.0k

0

Thank you for the kind response Carlo. The design looks like this:

        Pathology   Treatment
Sample1     A           No
Sample2     B           No
Sample3     A           No
Sample4     A           Yes
Sample5     A           No
Sample6     A           Yes
Sample7     A           No
Sample8     A           No
Sample9     A           Yes
Sample10    B           No
Sample11    B           No
Sample12    A           Yes
Sample13    B           Yes

Sorry, I actually have 13 samples. These are clinical samples, so they are not balanced in terms of treatment and pathology distribution.

Many thanks!

ADD REPLY • link 4.5 years ago by wang.hanyin • 0

score 1 · Answer 1 · 2021-02-09

1

4.5 years ago

swbarnes2 15k

In general, you include all samples in the same DESeq object, even if you are only comparing a subset to a subset. I'd say that is especially crucial when you have so few samples.

This is a really small number of samples for two factors, but removing samples from the analysis won't help. You should try ~ treatment + pathology, but it might be a problem that you have only 1 B-Yes sample. But if you had zero, then you'd really be in trouble.
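To make the point about the single B-Yes sample concrete, here is a minimal base-R sketch that rebuilds the sample table posted above, counts samples per pathology/treatment cell, and checks that the additive design matrix is full rank. The object names (coldata, cell_counts, design) are illustrative assumptions, not anything from the original poster's code.

```r
# Sample table transcribed from the design posted in this thread (13 samples).
coldata <- data.frame(
  row.names = paste0("Sample", 1:13),
  pathology = factor(c("A", "B", "A", "A", "A", "A", "A",
                       "A", "A", "B", "B", "A", "B")),
  treatment = factor(c("No", "No", "No", "Yes", "No", "Yes", "No",
                       "No", "Yes", "No", "No", "Yes", "Yes"),
                     levels = c("No", "Yes"))
)

# Number of samples in each pathology x treatment cell.
cell_counts <- table(coldata$pathology, coldata$treatment)
print(cell_counts)

# Additive design: intercept + treatment effect + pathology effect.
design <- model.matrix(~ treatment + pathology, data = coldata)

# The additive model can be fitted only if the design matrix is full rank.
full_rank <- qr(design)$rank == ncol(design)

# Residual degrees of freedom left for estimating dispersion.
residual_df <- nrow(design) - ncol(design)
```

With this table the additive model is estimable, but the B-Yes cell holds a single sample, so an interaction term (~ treatment * pathology) would rest on that one patient.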

ADD COMMENT • link 4.5 years ago by swbarnes2 15k

0

Thank you so much for the kind reply, swbarnes2!

I was thinking about the same approach: use ~ treatment + pathology as the design formula for both the treatment comparison (with contrast=c("treatment", "Yes", "No")) and the pathology comparison (with contrast=c("pathology", "A", "B")). This way, each comparison is adjusted for the other factor. Note that the level names in the contrast must match the factor levels exactly, including case. Deeply appreciate the kind suggestion.
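For reference, the workflow described here could be sketched roughly as follows. This is only an illustration under stated assumptions: the count matrix is simulated placeholder data (not the real Nanostring counts), the gene names and object names are hypothetical, and only the sample table comes from this thread.

```r
library(DESeq2)

set.seed(1)

# Sample table transcribed from the design posted in this thread.
coldata <- data.frame(
  row.names = paste0("Sample", 1:13),
  pathology = factor(c("A", "B", "A", "A", "A", "A", "A",
                       "A", "A", "B", "B", "A", "B")),
  treatment = factor(c("No", "No", "No", "Yes", "No", "Yes", "No",
                       "No", "Yes", "No", "No", "Yes", "Yes"),
                     levels = c("No", "Yes"))
)

# ASSUMPTION: simulated negative-binomial counts stand in for the real data.
n_genes <- 200
gene_means <- 2^runif(n_genes, min = 4, max = 10)
gene_disp <- 0.1 + 4 / gene_means
counts <- matrix(
  rnbinom(n_genes * nrow(coldata), mu = gene_means, size = 1 / gene_disp),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)

# All 13 samples go into one object with the two-factor additive design.
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ treatment + pathology)
dds <- DESeq(dds)

# Treatment effect (Yes vs No), adjusted for pathology.
res_treatment <- results(dds, contrast = c("treatment", "Yes", "No"))

# Pathology effect (A vs B), adjusted for treatment.
res_pathology <- results(dds, contrast = c("pathology", "A", "B"))
```

Both result tables come from the same fitted model, so every sample contributes to the dispersion estimates for each comparison.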

ADD REPLY • link 4.5 years ago by wang.hanyin • 0