Question

Two factors plus time, LRT with DEseq2

0

Entering edit mode

20 months ago

boczniak767 ▴ 870

Hi,

Experiment design

I will have results from three-way RNA-seq experiment. Now I'm trying to prepare for it's analysis.

There will be 4 maize lines (for now denoted: "A", "B", "C", "D") and two treatments: cold ("x") and regrowth ("c"). For each treatment there will be say 8 time points (in my simulated data only 4).

In fact plants will be grown in cold ("x") treatment first, after which the temperature will be raised for regrowth ("c"). At each time point leaves from three plant per line will be pooled as a sample. And these plants would be discarded.

So for each line it could be either analyzed in such a way

that cold and regrowth profiles are joined or
as a separate cold and regrowth profiles.

I think 1. would be better for clustering, but option 2. seems better from the testing point of view.

Biological questions

1. What are differences between treatment (cold vs regrowth) time profiles for each line? It could be analyzed for each line separately, say in DEseq2.

2. Do lines react differently to treatments? It involves line:treatment interaction, so it could be only tested in two-way analysis.

My attempts

I tried to expand the example from DEseq2 to include second factor https://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments

As three-way analysis gave "matrix not full rank" error I've joined line and treatment in one variable, named group as proposed in http://127.0.0.1:13491/library/DESeq2/doc/DESeq2.html#interactions

So my colData is

> sym.coldata
       line treatment time group
Ax15.1     A       x   15    Ax
Ax15.2     A       x   15    Ax
Ax15.3     A       x   15    Ax
Ax22.1     A       x   22    Ax
Ax22.2     A       x   22    Ax
Ax22.3     A       x   22    Ax
Ax29.1     A       x   29    Ax
Ax29.2     A       x   29    Ax
Ax29.3     A       x   29    Ax
Ax36.1     A       x   36    Ax
Ax36.2     A       x   36    Ax
Ax36.3     A       x   36    Ax
Ac15.1     A       c   15    Ac
Ac15.2     A       c   15    Ac
Ac15.3     A       c   15    Ac
Ac22.1     A       c   22    Ac
Ac22.2     A       c   22    Ac
Ac22.3     A       c   22    Ac
Ac29.1     A       c   29    Ac
Ac29.2     A       c   29    Ac
Ac29.3     A       c   29    Ac
Ac36.1     A       c   36    Ac
Ac36.2     A       c   36    Ac
Ac36.3     A       c   36    Ac
Bx15.1     B       x   15    Bx
Bx15.2     B       x   15    Bx
Bx15.3     B       x   15    Bx
Bx22.1     B       x   22    Bx
Bx22.2     B       x   22    Bx
Bx22.3     B       x   22    Bx
Bx29.1     B       x   29    Bx
Bx29.2     B       x   29    Bx
Bx29.3     B       x   29    Bx
Bx36.1     B       x   36    Bx
Bx36.2     B       x   36    Bx
Bx36.3     B       x   36    Bx
Bc15.1     B       c   15    Bc
Bc15.2     B       c   15    Bc
Bc15.3     B       c   15    Bc
Bc22.1     B       c   22    Bc
Bc22.2     B       c   22    Bc
Bc22.3     B       c   22    Bc
Bc29.1     B       c   29    Bc
Bc29.2     B       c   29    Bc
Bc29.3     B       c   29    Bc
Bc36.1     B       c   36    Bc
Bc36.2     B       c   36    Bc
Bc36.3     B       c   36    Bc
Cx15.1     C       x   15    Cx
Cx15.2     C       x   15    Cx
Cx15.3     C       x   15    Cx
Cx22.1     C       x   22    Cx
Cx22.2     C       x   22    Cx
Cx22.3     C       x   22    Cx
Cx29.1     C       x   29    Cx
Cx29.2     C       x   29    Cx
Cx29.3     C       x   29    Cx
Cx36.1     C       x   36    Cx
Cx36.2     C       x   36    Cx
Cx36.3     C       x   36    Cx
Cc15.1     C       c   15    Cc
Cc15.2     C       c   15    Cc
Cc15.3     C       c   15    Cc
Cc22.1     C       c   22    Cc
Cc22.2     C       c   22    Cc
Cc22.3     C       c   22    Cc
Cc29.1     C       c   29    Cc
Cc29.2     C       c   29    Cc
Cc29.3     C       c   29    Cc
Cc36.1     C       c   36    Cc
Cc36.2     C       c   36    Cc
Cc36.3     C       c   36    Cc
Dx15.1     D       x   15    Dx
Dx15.2     D       x   15    Dx
Dx15.3     D       x   15    Dx
Dx22.1     D       x   22    Dx
Dx22.2     D       x   22    Dx
Dx22.3     D       x   22    Dx
Dx29.1     D       x   29    Dx
Dx29.2     D       x   29    Dx
Dx29.3     D       x   29    Dx
Dx36.1     D       x   36    Dx
Dx36.2     D       x   36    Dx
Dx36.3     D       x   36    Dx
Dc15.1     D       c   15    Dc
Dc15.2     D       c   15    Dc
Dc15.3     D       c   15    Dc
Dc22.1     D       c   22    Dc
Dc22.2     D       c   22    Dc
Dc22.3     D       c   22    Dc
Dc29.1     D       c   29    Dc
Dc29.2     D       c   29    Dc
Dc29.3     D       c   29    Dc
Dc36.1     D       c   36    Dc
Dc36.2     D       c   36    Dc
Dc36.3     D       c   36    Dc

Analysis

I've successfully created dds object of DEseq class

> dds2=DESeqDataSetFromMatrix(countData=cts2, colData=sym.coldata, design= ~ 0 + group + time + group:time)
converting counts to integer mode

I've removed intercept to make contrasts more readable as proposed here https://bioinformatics.stackexchange.com/a/154/2481 and here https://ucdavis-bioinformatics-training.github.io/2019-March-Bioinformatics-Prerequisites/thursday/linear_models.html in "Another parametrization" section.

Likelihood Ratio Test on reduced model

> dds2=DESeq(dds2, test="LRT", reduced = ~ 0 + group + time)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> res2=results(dds2)

From this point I don't know what to do

I can extract coefficients

> resultsNames(dds2)
 [1] "groupAc"        "groupAx"        "groupBc"        "groupBx"       
 [5] "groupCc"        "groupCx"        "groupDc"        "groupDx"       
 [9] "time22"         "time29"         "time36"         "groupAx.time22"
[13] "groupBc.time22" "groupBx.time22" "groupCc.time22" "groupCx.time22"
[17] "groupDc.time22" "groupDx.time22" "groupAx.time29" "groupBc.time29"
[21] "groupBx.time29" "groupCc.time29" "groupCx.time29" "groupDc.time29"
[25] "groupDx.time29" "groupAx.time36" "groupBc.time36" "groupBx.time36"
[29] "groupCc.time36" "groupCx.time36" "groupDc.time36" "groupDx.time36"

Of course the p-value from the test would say if there is any effect of group across time (not informative at all). So I'd have to make my own contrasts.

I understand that eg. groupAc is a coefficient for group "Ac" (line "A" in treatment "c") at time "15" (which, as I can see has been set as reference time). So to get treatment effect for line "A" at time "15" I'd have to make following contrast

res15 <- results(dds2, contrast = c("group", "Ax", "Ac"))

Similarly, to get treatment effect for line "B" at time "22" I'd have to compare "groupBc.time22" and "groupBx.time22".

I wonder, where is coefficient for "time15"?

I understand that for my Biological questions given above I'd have to make more complicated contrasts, involving combination (sums?) of coefficients. Is it possible in DEseq2? How to do that? I also think about maSigPro package, but for now I'd like to try DEseq2.

ADD COMMENT • link 20 months ago by boczniak767 ▴ 870