I have a design of 6 conditions with three replicates each, with expression levels for ~51000 transcription products. I'm using Limma's voom transformation to make my data approximately Gaussian, which by ocular pat-down seems to be the case:
The mean-variance estimation is visible here - kinda overdispersed in the mid range but it doesn't look critically bad to me:
I fit the linear model thusly:
> labels
[1] A A A C C C G G G L L L T T T U U U
Levels: A C G L T U
design <- model.matrix(~labels)
fit2 <- eBayes( lmFit(voomCounts, design) )
However, this yields the following p-value distribution from the pairwise t-tests (looks like a misspecified model)
The fit2$F.p.vals
has only 10-15 genes above 0.01, the rest are absolutely tiny; and when I plot some of the genes called for differential expression they look very unremarkable. What am I doing wrong?
Thanks so much :)
Thank you so much! This community is terrific.
To those that stumble on this answer by googling, here's how I generated the contrast matrix for 6 groups instead of two (my groups are A, C, G, L ,T, U):
nice. I didn't know about `combn`
Hi Mavendra,
I am following you approach but when I try to log2 normalize my data frame then lmFit gives the following error:
What I am doing is mentioned below:
.. and then I get the error. Actually I want to log2 normalize my data frame as is being done by GEO2R before applying limma.
Thanks, looking forward to your response.