Question

How to properly subset a dds object to compare two conditions

0

16 months ago

dylannicoembros • 0

I have data divided into three conditions: A, B, and C. I would like to run a DE analysis on only the first two conditions, removing the samples with condition C. One solution I found is to fix the problem at the root by removing those samples directly from the raw data I read in, but could that be dangerous? (The sample ordering might get corrupted.) On the other hand, if I subset the dds object after creating it, the condition factor still has three levels, since I created it with all the samples, and some errors occur.

What should I do?

UPDATE: Using results() with the contrast argument is not an option for me because, as I understand it, DESeq() estimates its statistics using all samples. So I thought that would be an incorrect way to do my analysis.

R DESeq2 bioconductor • 1.0k views

updated 16 months ago by Ram 44k • written 16 months ago by dylannicoembros • 0

0

Why do you need to discard the samples with condition C? (Biological or technical reasons?)

16 months ago by Basti ★ 2.0k

0

I actually read that DESeq2 estimates its parameters gene by gene, so the recommendation is to create the dds with all the samples and subset later. I found the answer in the vignette FAQ.
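For anyone landing here, a minimal sketch of that "fit everything, then extract one comparison" route might look like the following. The count matrix, the sample names (s1 to s9) and the object names are made up for illustration, and the DESeq2 part is skipped if the package is not installed.

```r
# Toy data (hypothetical): 200 genes x 9 samples, three per condition.
set.seed(1)
n_genes <- 200
gene_mu <- 2^runif(n_genes, min = 4, max = 10)  # per-gene mean count
counts <- matrix(
  rnbinom(n_genes * 9, mu = rep(gene_mu, times = 9), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:9))
)
coldata <- data.frame(
  condition = factor(rep(c("A", "B", "C"), each = 3)),
  row.names = colnames(counts)
)

# The DESeq2 part only runs if the package is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                                design = ~ condition)
  dds <- DESeq(dds)  # gene-wise dispersions are estimated from all nine samples
  # B vs A only; condition C influences this test through the shared
  # dispersion and size factor estimates, not through the comparison itself.
  res_BA <- results(dds, contrast = c("condition", "B", "A"))
  print(head(res_BA))
}
```

The contrast is given as c(factor name, numerator level, denominator level), so the log fold changes here are B relative to A.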

16 months ago by dylannicoembros • 0

2

Your issue is described here: https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups

Check the within-group variability of your condition C samples with a PCA. If it is much larger than in A and B, start from the reduced dataset (A and B only); otherwise, fit all samples together and use contrasts.
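A rough sketch of both steps, assuming a toy count matrix and made-up sample names (none of these objects come from the original post): subsetting the counts and the sample table with the same logical index keeps their order in sync, and droplevels() removes the unused level C that otherwise causes errors. The DESeq2 calls are skipped if the package is not installed.

```r
# Toy data (hypothetical): 200 genes x 9 samples, three per condition.
set.seed(1)
n_genes <- 200
gene_mu <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * 9, mu = rep(gene_mu, times = 9), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:9))
)
coldata <- data.frame(
  condition = factor(rep(c("A", "B", "C"), each = 3)),
  row.names = colnames(counts)
)

# Subset counts and sample table with the SAME index, then drop level C.
keep <- coldata$condition %in% c("A", "B")
counts_ab <- counts[, keep]
coldata_ab <- droplevels(coldata[keep, , drop = FALSE])
stopifnot(identical(colnames(counts_ab), rownames(coldata_ab)))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)

  # 1) PCA on all samples to judge how variable condition C is.
  dds_all <- DESeqDataSetFromMatrix(counts, coldata, design = ~ condition)
  vsd <- varianceStabilizingTransformation(dds_all, blind = TRUE)
  print(plotPCA(vsd, intgroup = "condition"))

  # 2) If C looks much noisier, analyse the reduced dataset only.
  dds_ab <- DESeqDataSetFromMatrix(counts_ab, coldata_ab, design = ~ condition)
  dds_ab <- DESeq(dds_ab)
  print(head(results(dds_ab, contrast = c("condition", "B", "A"))))

  # Equivalent when a dds with all samples already exists:
  dds_sub <- dds_all[, dds_all$condition %in% c("A", "B")]
  dds_sub$condition <- droplevels(dds_sub$condition)
}
```

Either way of subsetting works as long as the unused factor level is dropped before DESeq() is run.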

16 months ago by Basti ★ 2.0k