DESeq2 design question

3.3 years ago

Lepomis_8 ▴ 30

I have a count matrix from an RNA-seq experiment that I'd like to normalize and run differential expression (DE) analysis on using DESeq2. My code is below:

dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design= ~ condition)

My experiment is performed over two time periods: week 1 (treated vs untreated) and week 2 (untreated vs untreated). Samples were collected at the end of week 1 and at the end of week 2 without replacement. So essentially, in week 2 we should see the reversal of any genes upregulated in week 1 (and the data are clustering this way).

I have two possible coldata files

coldata1

sample_id   condition   week
treated1    treated     1
treated2    treated     1
treated3    treated     1
untreated1  untreated   1
untreated2  untreated   1
untreated3  untreated   1
treated4    treated     2
treated5    treated     2
treated6    treated     2
untreated4  untreated   2
untreated5  untreated   2
untreated6  untreated   2

coldata2

sample_id   condition   week
treated1    treatedA    1
treated2    treatedA    1
treated3    treatedA    1
untreated1  untreated   1
untreated2  untreated   1
untreated3  untreated   1
treated4    treatedB    2
treated5    treatedB    2
treated6    treatedB    2
untreated4  untreated   2
untreated5  untreated   2
untreated6  untreated   2

So coldata2 would have three condition levels (treatedA, treatedB, untreated) instead of two. I'm a bit lost on which is better, and on the best way to fill in the design argument. I was thinking about treating it as a time series, but since the treatment was reversed, I'm not sure that's appropriate.

Any help would be greatly appreciated! Apologies if this is not clear; please let me know and I'll try to re-explain.

Edit, for clarification:

During week1: treated vs untreated samples. End of week 1: harvested half of the samples and isolated RNA, etc. During week2: untreated (were treated in week 1) vs untreated (were untreated in week 1). End of week 2: harvested rest of samples and terminated experiment.

RNA-seq DESeq2 R • 1.1k views

Updated 3.3 years ago by rodolfo.peacewalker ▴ 390 • written 3.3 years ago by Lepomis_8 ▴ 30

My experiment is performed over two time periods: week 1 (treated vs untreated) and week 2 (untreated vs untreated)

Did you also mean treated vs untreated for week 2?

3.3 years ago by rpolicastro 13k

No, both were untreated. Essentially, week 2 reversed the environmental stressor that was applied during week 1. It wasn't a staggered reversal.

3.3 years ago by Lepomis_8 ▴ 30

The design formula depends on your research question. I suggest you take a look at this vignette to clarify which design best answers your question. With respect to coldata2, I suggest you combine the levels of both variables (i.e. treatedA, treatedB, untreated, 1, and 2) into a single factor by creating a new column called group.
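
As a rough sketch of the group idea, assuming the sample table matches coldata2 from the question and that `cts` is the count matrix with its columns in the same order as the rows of `coldata`: the level names (e.g. `treatedA_wk1`) and the result object names are only illustrative, and the DESeq2 step is guarded so the snippet does nothing there unless the package and `cts` are available.

```r
# Sample table matching coldata2 from the question
coldata <- data.frame(
  row.names = c(paste0("treated", 1:3), paste0("untreated", 1:3),
                paste0("treated", 4:6), paste0("untreated", 4:6)),
  condition = rep(c("treatedA", "untreated", "treatedB", "untreated"), each = 3),
  week      = rep(c("1", "2"), each = 6)
)

# Combine condition and week into a single factor with four levels
# (illustrative level names: treatedA_wk1, untreated_wk1, treatedB_wk2, untreated_wk2)
coldata$group <- factor(paste0(coldata$condition, "_wk", coldata$week))

# Make the week 1 untreated samples the reference level
coldata$group <- relevel(coldata$group, ref = "untreated_wk1")

# Runs only if DESeq2 is installed and the count matrix `cts` is in the workspace
if (requireNamespace("DESeq2", quietly = TRUE) && exists("cts")) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = cts,
                                        colData   = coldata,
                                        design    = ~ group)
  dds <- DESeq2::DESeq(dds)

  # Treated vs untreated at the end of week 1
  res_week1 <- DESeq2::results(
    dds, contrast = c("group", "treatedA_wk1", "untreated_wk1"))

  # Previously treated vs never treated at the end of week 2
  res_week2 <- DESeq2::results(
    dds, contrast = c("group", "treatedB_wk2", "untreated_wk2"))
}
```

With a single `group` factor, each comparison of interest becomes one explicit `contrast`, which may be easier to reason about here than an interaction term.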

Best regards!

3.3 years ago by rodolfo.peacewalker ▴ 390