Dear users,
I am struggling to understand if my design is correct. I found edgeR section 3.3.1 similar to my situation but I am not that confident.
Here is my experimental design:
samples_table
sampleId cellLine treatment time IC
s1 a vehicle 0 S
s2 a drug 48 S
s3 a drug 168 S
s4 b vehicle 0 S
s5 b drug 48 S
s6 b drug 168 S
s7 c vehicle 0 S
s8 c drug 48 S
s9 c drug 168 S
s10 d vehicle 0 S
s11 d drug 48 S
s12 d drug 168 S
s13 e vehicle 0 R
s14 e drug 48 R
s15 e drug 168 R
s16 f vehicle 0 R
s17 f drug 48 R
s18 f drug 168 R
I have 6 cell lines treated with a drug and RNA sequenced after 48 and 168 hours of treatment. Last column indicates if the cell line is susceptible or resistant to another compound.
I would like to find how resistant cell lines differentially respond to the drug at 48 and/or at 168 hours compared to resistant ones.
Here is my approach:
group <- factor(paste(samples_table$IC, samples_table$time, sep="."))
y <- DGEList(counts.keep, group=group)
y <- calcNormFactors(y)
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)
# Quasi-likelihood test
fit <- glmQLFit(y, design)
design
R.0 R.168 R.48 S.0 S.168 S.48
1 0 0 0 1 0 0
2 0 0 0 0 0 1
3 0 0 0 0 1 0
4 0 0 0 1 0 0
5 0 0 0 0 0 1
6 0 0 0 0 1 0
7 0 0 0 1 0 0
8 0 0 0 0 0 1
9 0 0 0 0 1 0
10 1 0 0 0 0 0
11 0 0 1 0 0 0
12 0 1 0 0 0 0
13 1 0 0 0 0 0
14 0 0 1 0 0 0
15 0 1 0 0 0 0
attr(,"assign")
[1] 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"
my.contrasts <- makeContrasts(
S48 = S.48 - S.0,
S168 = S.168 - S.0,
S168vs48 = S.168 - S.48,
R48 = R.48 - R.0,
R168 = R.168 - R.0,
R168vs48 = R.168 - R.48,
RvsS.0 = R.0 - S.0,
RvsS.48 = (R.48 - R.0) - (S.48 - S.0),
RvsS.168 = (R.168 - R.0) - (S.168 - S.0),
all = (R.48 + R.168 - R.0) - (S.48 + S.168 - S.0),
levels = design
)
# to find genes that differentially respond at 48h between resistant and susceptible cell lines
qlf <- glmQLFTest(fit,
contrast = my.contrasts[ , "RvsS.48"])
topTags(qlf)
What do you think? Shouldn't I account the fact that I have different cell lines as a sort of batch effect?
Thanks a lot
Pietro
PS: cross posted to https://support.bioconductor.org/p/119310/
As you posted on Bioconductor, Aaron or Gordon will likely respond, and their answer will supersede any answer given here. On face value —yes— you should be adjusting for cell-line by, for example, including
cellLine
in your design formula. So, something like:Before doing this, I would also just check via a PCA bi-plot to see how
cellLine
is distributed across your samplesThanks Kevin,
Tried that, but it throws an "Design matrix not of full rank" error.
You can't include both
cellLine
andIC
, they're mutually exclusive.Hi Devon,
Realized that. So, what do you suggest to account for
cellLine
in my design?Thanks
Remove
IC
and perform any relevant grouping of it as needed during a contrast.You mean
?
And then I have
How do I specify that I want time48 only for cell lines
e
andf
against time48 for the other 4 cell lines? ThanksIf you want specific group comparisons like that it's more convenient to make groups like
a_time48
,a_time168
and use~0 + group
as a model.