I have RNA-seq data from two treatment groups (three replicates each) and I am trying to replicate the likelihood ratio test outlined on page 1516 of this paper by Marioni et al: http://www.ncbi.nlm.nih.gov/pubmed/18550803. So far I have been able to fit a Poisson regression model for each gene using the following function and command:
poissonGLM <- function(gene) {
data <- data.frame(treatment=pdata$treatment, count=gene)
model <- glm(count ~ treatment, data=data, family=poisson)
mean <- predict(model.pois, type="response"),
stdev <- predict(model.pois, type="response", se.fit=T)$se.fit
}
results <- apply(edata, 1, poissonGLM)
For each gene, this computes the expected mean and expected variance for each treatment group. However, I'm unsure how to use this model to compute the maximum likelihood estimates both under the null and alternative hypothesis (as described in the paper), and generate a P-value from the Chi-squared distribution.
So the intercept only model is what Marioni et al. refer to as the null hypothesis, while the model with the treatment group specified refers to the alternative hypothesis?
The P-value is therefore the following?
They go on to estimate fold change by fitting the model under the alternative hypothesis, which would mean calculating the ratio of the expected means I've already computed?
Yes the null hypothesis is the intercept only model. Likelihood ratio tests use the log-likelihood and not the predicted count. That means using the
logLik
function. This means the statistic is-2*(logLik(model.null) - logLik(model.treatment))
where the difference is subtraction because the log of ratios equal the log of the numerator minus the log of the denominator (a property of logarithms).The p-value is just the probability of observing that chi-squared statistic or greater from a chi-squared distribution with d.f. = 1 in your case. All the above work should give you the same result as the anova function I had mentioned initially. So yes the p-value can be extracted like you did above. Presumably thats what they mean by fold change but I haven't read through that paper before.