Question

Using the dispersion result from DESeq2 from one dataset to generate negative binomial sample


9.0 years ago

hirak.sarkar ▴ 20

Hi,

I want to use the dispersion estimates from DESeq2 to simulate random variables that follow a negative binomial distribution. Page 2 of the DESeq2 paper says:

Within-group variability, i.e., the variability between replicates, is modeled by the dispersion parameter α_i, which describes the variance of counts via Var K_ij = μ_ij + α_i μ_ij^2.

This suggests that, in the usual negative binomial parameterization, r = 1/α. Please let me know whether my understanding is correct.
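To make the mapping concrete, here is a minimal sketch that converts a mean and a DESeq2-style dispersion into the (r, p) parameters of the negative binomial and checks that the resulting variance matches μ + α μ^2. The function names and the numeric values are hypothetical, chosen only for illustration, and the sketch assumes the "number of failures before r successes" convention with success probability p = r / (r + μ).

```python
import math


def nb_params_from_dispersion(mu, alpha):
    """Map (mean, dispersion) to (r, p) with r = 1/alpha and p = r / (r + mu).

    Hypothetical helper; assumes mu > 0 and alpha > 0.
    """
    if mu <= 0 or alpha <= 0:
        raise ValueError("mu and alpha must both be positive")
    r = 1.0 / alpha
    p = r / (r + mu)
    return r, p


def nb_mean_var(r, p):
    """Mean and variance of a negative binomial with size r and probability p."""
    mean = r * (1.0 - p) / p
    var = r * (1.0 - p) / p ** 2
    return mean, var


# Illustrative values only, not taken from any real dataset.
mu, alpha = 100.0, 0.1
r, p = nb_params_from_dispersion(mu, alpha)
mean, var = nb_mean_var(r, p)

# The variance implied by (r, p) should agree with mu + alpha * mu^2.
print("r =", r, "p =", p)
print("mean =", mean, "variance =", var)
print("mu + alpha * mu^2 =", mu + alpha * mu ** 2)
print("match:", math.isclose(var, mu + alpha * mu ** 2))
```

If NumPy is available, the same (r, p) pair can be passed as `n` and `p` to `numpy.random.Generator.negative_binomial`, which uses this convention and accepts a non-integer `n`.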

RNA-Seq DESeq2 dispersion negative-binomial • 2.1k views



Tagging: Michael Love

Reply by GenoMax 153k, 9.0 years ago

score 2 · Accepted Answer · 2016-07-28

Okay, I found the source of my confusion: \alpha is the reciprocal of r, and DESeq2 reports this \alpha. A dispersion of zero means the variance equals the mean, \sigma^2 = \mu, so the distribution is no longer a proper negative binomial but its Poisson limit (r going to infinity). One hack I can think of is to substitute a very small positive value for \alpha to avoid dividing by zero when computing r = 1/\alpha. Please suggest a better way if there is one.
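One alternative to plugging in a tiny \alpha is to treat zero dispersion as the Poisson case explicitly. The sketch below draws negative binomial counts as a gamma-Poisson mixture using only the Python standard library and falls back to a plain Poisson draw when \alpha is effectively zero. The function names, the cutoff `ZERO_DISPERSION`, and the example values are all assumptions made for illustration; this is not how DESeq2 itself handles the case.

```python
import math
import random

# Assumed cutoff below which alpha is treated as zero (arbitrary choice).
ZERO_DISPERSION = 1e-8


def sample_poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method.

    Only suitable for modest lam: exp(-lam) underflows for lam above ~700.
    """
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_nb(mu, alpha, rng):
    """Draw one count with mean mu and variance mu + alpha * mu^2."""
    if mu < 0 or alpha < 0:
        raise ValueError("mu and alpha must be non-negative")
    if mu == 0:
        return 0
    if alpha < ZERO_DISPERSION:
        # Zero dispersion: variance equals the mean, i.e. Poisson.
        return sample_poisson(mu, rng)
    r = 1.0 / alpha
    # Gamma with shape r and scale mu / r has mean mu and variance alpha * mu^2.
    lam = rng.gammavariate(r, mu / r)
    return sample_poisson(lam, rng)


rng = random.Random(0)
overdispersed = [sample_nb(20.0, 0.5, rng) for _ in range(5)]
poisson_like = [sample_nb(20.0, 0.0, rng) for _ in range(5)]
print(overdispersed)
print(poisson_like)
```

Branching on the dispersion keeps r = 1/\alpha from ever being evaluated at zero, so no artificial floor value leaks into the simulated counts.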

The variance can then be written m + m^2/r. Some authors prefer to set α = 1/r, and express the variance as m + α m^2. In this context, and depending on the author, either the parameter r or its reciprocal α is referred to as the “dispersion parameter”, “shape parameter” or “clustering coefficient”,[5] or the “heterogeneity”[4] or “aggregation” parameter.[6] The term “aggregation” is particularly used in ecology when describing counts of individual organisms. Decrease of the aggregation parameter r towards zero corresponds to increasing aggregation of the organisms; increase of r towards infinity corresponds to absence of aggregation, as can be described by Poisson regression.