I want to use the DESeq2 dispersion estimates to model random variables that follow a negative binomial distribution. According to the DESeq2 paper, on page 2:
Within-group variability, i.e., the variability between replicates,
is modeled by the dispersion parameter α_i, which
describes the variance of counts via Var K_ij = μ_ij + α_i μ_ij^2.
This suggests that, in the usual negative binomial parameterization, r = 1/α. Please let me know whether my understanding is correct.
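The mapping from DESeq2's (μ, α) to the (size, probability) form can be sketched as follows. This is only a minimal Python sketch that assumes α > 0; the function names `nb_size_prob` and `nb_mean_var` are made up for illustration. The returned pair (r, p) is in the form that samplers such as R's `rnbinom(size=, prob=)` or NumPy's `Generator.negative_binomial(n, p)` take.

```python
def nb_size_prob(mu, alpha):
    """Convert mean mu and dispersion alpha into (r, p).

    Assumes mu > 0 and alpha > 0, with r = 1/alpha and p = r / (r + mu).
    """
    if mu <= 0 or alpha <= 0:
        raise ValueError("this sketch assumes mu > 0 and alpha > 0")
    r = 1.0 / alpha
    p = r / (r + mu)
    return r, p


def nb_mean_var(r, p):
    """Mean and variance of a negative binomial with size r and probability p."""
    mean = r * (1.0 - p) / p
    var = r * (1.0 - p) / p ** 2
    return mean, var


# Hypothetical values for one gene in one sample.
mu, alpha = 100.0, 0.1
r, p = nb_size_prob(mu, alpha)
mean, var = nb_mean_var(r, p)
# mean should recover mu, and var should match mu + alpha * mu**2.
```

If the round trip gives back μ and μ + α μ^2, the r = 1/α reading is consistent with the variance formula quoted above.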
Okay, so I found the source of my confusion: the reciprocal of r is called \alpha.
So DESeq2 reports this \alpha. I guess zero dispersion means that the variance and the mean are the same, that is, \sigma^2 = \mu, so the distribution is no longer a negative binomial but its Poisson limit. A possible hack I can think of is substituting a very small value for \alpha to avoid a division-by-zero error when computing r = 1/\alpha. Please suggest if there is a better way.
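One alternative to the small-value hack might be to branch on \alpha = 0 and use the Poisson distribution directly. The following is a minimal Python sketch under that assumption; `nb_log_pmf` is a hypothetical helper name, not a DESeq2 function. It writes the log-probability in terms of \alpha \mu, so r = 1/\alpha is only formed when \alpha > 0.

```python
import math


def nb_log_pmf(k, mu, alpha):
    """Log-probability of count k given mean mu and dispersion alpha >= 0.

    alpha == 0 is treated as the Poisson limit (variance equal to the mean).
    """
    if mu <= 0 or alpha < 0 or k < 0:
        raise ValueError("this sketch assumes mu > 0, alpha >= 0 and k >= 0")
    if alpha == 0:
        # Poisson log-pmf: k*log(mu) - mu - log(k!)
        return k * math.log(mu) - mu - math.lgamma(k + 1)
    r = 1.0 / alpha
    am = alpha * mu
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        - r * math.log1p(am)                      # r * log(r / (r + mu))
        + k * (math.log(am) - math.log1p(am))     # k * log(mu / (r + mu))
    )


# Hypothetical comparison: a tiny dispersion versus an exact zero.
p_tiny = math.exp(nb_log_pmf(3, 5.0, 1e-8))
p_zero = math.exp(nb_log_pmf(3, 5.0, 0.0))
# The two values should agree closely, since alpha -> 0 approaches the Poisson case.
```

The same branching idea would apply to sampling: draw from a Poisson with mean \mu when \alpha is zero, and from a negative binomial with r = 1/\alpha otherwise.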
The variance can then be written m + m^2/r. Some authors prefer to set α = 1/r, and express the variance as m + α m^2. In this context, and depending on the author, either the parameter r or its reciprocal α is referred to as the “dispersion parameter”, “shape parameter” or “clustering coefficient”,[5] or the “heterogeneity”[4] or “aggregation” parameter.[6] The term “aggregation” is particularly used in ecology when describing counts of individual organisms. Decrease of the aggregation parameter r towards zero corresponds to increasing aggregation of the organisms; increase of r towards infinity corresponds to absence of aggregation, as can be described by Poisson regression.
Tagging: Michael Love