Hi,
I faced a similar issue. I think you can use DESeq2 as a starting point, but then need to do some manual adjustments. I posted to DESeq2 support at https://support.bioconductor.org/p/9138190/ but so far got no response whether the authors agree it is a sensible approach, so some caution is warranted.
EDIT: Mike Love suggested using controlGenes
when running estimateSizeFactors
in front of DESeq()
instead of the method below. Both have some advantages, see the above-linked bioconductor thread for details.
Assuming a design of counts ~ condition
the null hypothesis is that for each gene A
and a fixed reference B
:
log(mu_A_condition1) - log(mu_B_condition1) >= log(mu_A_condition2) - log(mu_B_condition2)
.
My understanding is that this null hypothesis can actually be tested with DESeq2 with a little tweak, because with dummy coding it translates to beta_A_condition2 - beta_B_condition2 <= 0
, so I can get the maximum likelihood estimate by just subtracting the two beta
values given by DESeq2, get the SE by taking sqrt(beta_A_SE^2 + beta_B_SE_^2)
(as the betas should have 0 covariance, right?) which is all I need to compute the Wald statistic and then I can once again carry on the same way DESeq2 does (p-adjustment, shrinkage, ...)
Here's my R code that performs this - you give the value of results(dds)
for the coefficient you are trying to compare and the name of the reference gene. It then computes estimates for all genes against the reference, performs p-value adjustments and shrinkage with ashr
change_from_reference <- function(results, reference, altHypothesis = "greater" , threshold = 0) {
reference_beta <- results[reference, "log2FoldChange"]
reference_beta_SE <- results[reference, "lfcSE"]
res_others <- results[rownames(results) != reference,]
LFC <- res_others$log2FoldChange - reference_beta
SE <- sqrt(res_others$lfcSE ^ 2 + reference_beta_SE ^ 2)
T <- threshold
#Taken from on https://github.com/mikelove/DESeq2/blob/a5941b305b597556d47c68df0b922a0a20b41fb6/R/results.R#L481
pfunc <- function(q) pnorm(q, lower.tail=FALSE)
if (altHypothesis == "greaterAbs") {
newStat <- sign(LFC) * pmax((abs(LFC) - T)/SE, 0)
newPvalue <- pmin(1, 2 * pfunc((abs(LFC) - T)/SE))
} else if (altHypothesis == "lessAbs") {
newStatAbove <- pmax((T - LFC)/SE, 0)
pvalueAbove <- pfunc((T - LFC)/SE)
newStatBelow <- pmax((LFC + T)/SE, 0)
pvalueBelow <- pfunc((LFC + T)/SE)
newStat <- pmin(newStatAbove, newStatBelow)
newPvalue <- pmax(pvalueAbove, pvalueBelow)
} else if (altHypothesis == "greater") {
newStat <- pmax((LFC - T)/SE, 0)
newPvalue <- pfunc((LFC - T)/SE)
} else if (altHypothesis == "less") {
newStat <- pmin((LFC + T)/SE, 0)
newPvalue <- pfunc((-T - LFC)/SE)
} else {
stop("Invalid altHypothesis")
}
ashrfit <- ashr::ash(LFC, SE,
mixcompdist="normal", method="shrink")
data.frame(gene = rownames(res_others),
estimate = LFC, shrunkEst = ashrfit$result$PosteriorMean,
shrunkSE = ashrfit$result$PosteriorSD, p.adj = p.adjust(newPvalue, method = "BH"))
}