What about this approach based on edgeRUserGuide 3.3.1? It can be easily expanded to more complicated designs...
targets<- data.frame(condition= rep(c('A', 'B', 'C'), each= 2),
row.names= c('sampleA1', 'sampleA2', 'sampleB1',
'sampleB2', 'sampleC1', 'sampleC2'))
# > targets
# condition
# sampleA1 A
# sampleA2 A
# sampleB1 B
# sampleB2 B
# sampleC1 C
# sampleC2 C
matrixOfRawCounts<- matrix(rpois(n= 600, 100), ncol= 6)
Group<- as.factor(rep(c('A', 'B', 'C'), each= 2))
design <- model.matrix(~0+Group)
colnames(design) <- levels(Group)
row.names(design)<- row.names(targets)
#> design
# A B C
# sampleA1 1 0 0
# sampleA2 1 0 0
# sampleB1 0 1 0
# sampleB2 0 1 0
# sampleC1 0 0 1
# sampleC2 0 0 1
# attr(,"assign")
# [1] 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$Group
# [1] "contr.treatment"
## Now decide your contrasts of interest
contrasts<- makeContrasts(
A_vs_B= A - B,
A_vs_C= A - C,
A_vs_BC= A - (B+C)/2,
# etc...
levels= design
)
# Contrasts
# Levels A_vs_B A_vs_C A_vs_BC
# A 1 1 1.0
# B -1 0 -0.5
# C 0 -1 -0.5
## Fit model
d <- DGEList(counts = matrixOfRawCounts, group = Group)
d <- calcNormFactors(d)
d<- estimateGLMCommonDisp(d, design)
d<- estimateGLMTagwiseDisp(d, design)
fit<- glmFit(d, design)
## Get DEGs
lrt<- glmLRT(fit, contrast= contrasts[, "A_vs_B"])
detable<- topTags(lrt)
lrt<- glmLRT(fit, contrast= contrasts[, "A_vs_C"])
detable<- topTags(lrt)
lrt<- glmLRT(fit, contrast= contrasts[, "A_vs_BC"])
detable<- topTags(lrt)
What's right or wrong will depend on the question you want to answer. So base your design on that.
Thanku soo much martombo and devon... I actually want to find the survival genes among all these 3 treatments.
Ok I will be using Anova like test from edgeR, but do I have to consider intercept or not.
or
Which option should I use, for me I think, what devon said, as it represents the experimental design well.
What does
coef=2:3
means?Firstly, it depends on what one means by "survival genes". If I were to write that, I would mean that I have patients with some disorder who had undergone a variety of different treatments, of variable effectiveness, and that I'm looking for changes that correlate with favorable prognosis within/between treatments. We only know that you have 3, presumably non-overlapping, groups so it's difficult to give too much insight there.
What I suspect you'll actually want to do is make 3 pairwise comparisons between the various groups. The simplest way to do that is
design=model.matrix(~0+group, data=y$samples)
and then use some contrasts for each comparisons. There's very likely an example of this in the edgeR user guide. Then you'll have a list of AvsB, AvsC and BvsC differences, which is probably most useful.The coef=2:3 bit makes sense if you allow for an intercept. Basically it's looking for difference between calling all the samples groupA and giving them their proper assignments. This is effectively how a 1-way ANOVA works, though it's debateable how biologically relevant the results would be (perhaps it's very informative, perhaps not, it depends on the exact nature of the study). I should note that if you do allow for an intercept, then coef=2 would be the genes DE in groupB vs groupA and coef=3 would be those DE in groupC vs. A. The results this way should be identical to using contrasts, so at the end of the day either will work.
BTW, it's useful to know what an intercept is doing in a model matrix. Basically, it's a group that applies to all of your sample and against which all of your other groups are compared. In your case, this would be groupA. An intercept nicely matches experiments with factorial designs such as an experiment where WT and Mutant animals are treated with a drug or control. That's a classical 2x2 factorial design, where the intercept would be WT animals treated with control, since you'd be interested in changes relative to that. When you have multiple groups that you want to directly compare, then it's usually less confusing to just omit the intercept, since there's no single group against which things are being compared. You can usually doing things with/without an intercept and get the same results, but you'll be best served by not departing too much from the actual experimental design (doing so is a quick way to get confused about what the results actually mean).
I used these command.. Why am I getting the p value and FDR all zero??????
Your samples are only in one group! It looks like there's a problem with y$samples.
Oh, actually, as I mentioned in my comment above, the coef=2:3 part won't make sense without an intercept. That's likely the cause of the wonky output (I'm guessing that there are samples in each of the groups, just that they're not shown).
oh No actually I do have samples in all the groups. It's just that I sorted them, that's why the
head(design)
just showing samples in one group).Ok now I did include the intercept and used coef 2:3